Cloud


Here Comes the Cloud
By Diego Montalvo

In the past couple of years the term “cloud” has been
in every tech news headline, from companies offering
“cloud” computing to start-ups running killer “cloud” based
services. After a lot of thinking and rewriting I decided that
“cloud” can mean different things; it all depends on how it
is being referenced. For example, “cloud” by itself is short
for Internet or long for web. (...) Cloud computing has also
begun to infuse itself into different BSD flavors: Amazon’s
Elastic Compute Cloud (EC2) offers users the choice of
FreeBSD and NetBSD AMIs (Amazon Machine Images),
whereas RootBSD offers users a choice of BSD on Xen
based virtual private servers with full technical support.


Developers Corner 


Developing Applications Using mport
By Caryn Holt

In the February issue, you saw how to use the mport
system as an end user. In MidnightBSD, you can access
the mport features as a developer and add support for
mport to your existing C, C++ or Objective C application.
(...) The mport library allows developers to integrate
some or all of mport features into their own applications
without calling exec(). This article is just an introduction to
the many features available to MidnightBSD application
developers. I would recommend going to the MidnightBSD
website for further information.


BSD Certification 


Taking the BSDA Certification Exam
By Dru Lavigne 

The first article in this series (in the February 2012 
issue) addressed some common misconceptions 
about certification and described why you should be 
BSDA certified. The second article in this series (in the 
March 2012 issue) discussed how to prepare for the 
BSDA certification exam. This article will provide some 
background information on how the exam is delivered 
and why. 


Get Started 


Installing OpenBSD 5.0 on VMware Server
By Toby Richards

We're going to install OpenBSD 5.0. With the information
in this article, you'll learn how to install it both on your own
computer and via VMware Server. For my example, I'll
use my own Bsdvm.com account.

www.bsdmag.org


How To 


Installing FreeBSD on Amazon AWS EC2
Cloud Services
By Diego Montalvo

I have had an AWS account since Amazon first introduced
the Elastic Compute Cloud (EC2). But to be honest, back in
the day it was fairly cryptic to get an instance running: the
AWS web interface was in its infancy and documentation
was limited.


PostgreSQL: Replication
By Luca Ferrari

In the previous articles we saw how to set up a PostgreSQL
cluster, how to manage backups (either logical or physical)
and how transactions work internally. In this article we will
see how it is possible to replicate a running cluster to another
instance in order to have a fully mirrored and active “stand-by”
node. Most of the configuration will be done by simple
shell scripts, both to show the required instructions and
to allow readers to replay the experiments several times.


Interview 


Interview with Mark Price
By Diego Montalvo & BSD Team

I don’t think cloud is anything specifically new. I think
it's just a term that describes the trend in businesses
outsourcing more IT functions. My experience with IT
people in general is that IT people are very possessive
and territorial, wanting to have lots of servers doing lots
of things in-house. The ‘cloud’ idea just says “OK, we are
going to outsource some of this boring technology stuff
and instead concentrate on what's really important to us”.


Let’s Talk 


The Greater Benefits of Open Source
Software
By Paul Ammann
In contrast to proprietary software produced by most 
commercial manufacturers, Open Source software is 
written and perfected by volunteers, who freely share the 
programming code that would otherwise be kept secret. 
(...) Let’s address Open Source as a market phenomenon, 
stating some of the basic facts and seeking to clarify some 
misconceptions that have emerged in recent treatment of 
the issue. 
Here Comes the Clouds


In the past couple of years the term “cloud” has been in
every tech news headline, from companies offering “cloud”
computing to start-ups running killer “cloud” based services.
After a lot of thinking and rewriting I decided that “cloud” can
mean different things; it all depends on how it is being
referenced. For example, “cloud” by itself is short for Internet
or long for web.


I think it is important to differentiate between cloud
“computing” and “services”. “Cloud computing” is a
marketing term for virtual hosting services, which
allow virtual administration via an Internet connection.
“Cloud services” is a fancy phrase for services on the
Internet.

In 2006 cloud computing began its rise in popularity with the
debut of Amazon's Elastic Compute Cloud (EC2), which
began offering virtual servers to a mass audience.

Cloud services have been around since before the average
non-techie even knew what email was. As early as 1996
Hotmail (running on FreeBSD) introduced what was
essentially a cloud-based email system.

Today cloud computing is offered as “virtual machines”
or “instances”, which are both scalable and remotely
accessed via an SSH client, a VNC client or a customized
web interface of sorts (Figure 1).


“Cloud computing describes a new supplement, consumption, and 
delivery model for IT services based on Internet protocols, and it 
typically involves provisioning of dynamically scalable and often 
virtualized resources.” Wikipedia 


Back in 2000 I remember thinking how cool it would be
to build a web based word processor such as Word.
Even though it was a good idea back then, it is only
today that both sophisticated clouds and web technology
make it feasible. Once prominent desktop productivity
software has been seeing its demise due to web based
counterparts; for example, Google Docs is replacing the
desktop office suite.

Just as web services have begun to replace software,
cloud services have begun to replace small server farms
and on-site server administrators. In today’s fast paced
world, why purchase more servers and pay a server
administrator when you can simply scale your web
services with a few clicks and a credit card?

Figure 1. Amazon and HP Cloud Computing Web Interfaces


Cloud computing is to hardware as cloud services are to
software. Both are evolutions of technology and both
are replacing their counterparts.

Today’s Internet consists of text, images, video, music, 
on-demand video, real-time communication, instant stock 
quotes and more. Even though the Internet basics such 
as HTML are still the same, what has changed is the 
magnitude of information being exchanged every second 
of every day. Data storage has increased enormously. 
Once upon a time a single server had enough capacity 
to manage a popular website. Today’s popular websites 
run on hundreds if not thousands of servers and are 
constantly upgrading equipment. Ever since the inception of
the Internet, people have imagined that network
computers would replace software based computers; it is
happening now with cloud computing and cloud services.

With the advancement of Internet technology and
web services, more and more users are
relying on cloud services rather than on
computer software.

Cloud services can be found in the smallest of devices. 
Smart mobile devices and most mobile applications rely 
on some sort of data push or data request from the cloud 
to add application functionality (Figure 2). 

Most mobile carriers currently offer secure cloud
services for storing a mobile user’s contacts and other
stored information. Recently Apple Inc. has been pushing
iCloud, a backup service which pushes a user's device
content to all of their Apple devices.

Cloud computing has also begun to infuse itself into
different BSD flavors: Amazon’s Elastic Compute Cloud
(EC2) offers users the choice of FreeBSD and NetBSD
AMIs (Amazon Machine Images), whereas RootBSD
offers users a choice of BSD on Xen based virtual private
servers with full technical support.

In 2006 Amazon Web Services (AWS) released EC2,
which opened up Amazon's own server infrastructure as
a way of making use of its unused server capacity. EC2
allows users to choose from hundreds of AMIs, including
different tier offerings of FreeBSD and NetBSD. EC2 also
provides users with an administration web console. AMI
customization can be done using any SSH client, such as
PuTTY or a terminal.

RootBSD is a dedicated BSD virtual server provider
which specializes in FreeBSD but offers other BSD
distributions upon request. RootBSD offers users flat-rate
monthly plans. RootBSD virtual servers are Xen based
and allow users to administer services using any SSH
client. RootBSD offers technical support for all VPS or
custom plans.

Figure 2. Mobile Applications and Cloud Interaction

BuildaSearch (BaS), a web service I founded, is 100%
cloud based, running on FreeBSD Xen based virtual
servers. BaS allows users to crawl and index thousands
of pages in real-time. Cloud computing allows BaS to
scale services with a few clicks.

DuckDuckGo (DDG), a search engine which emphasizes
privacy, uses a cloud infrastructure to power its growth.
DDG results are a compilation of many sources, including
Yahoo! Search BOSS, Bing, Wikipedia, Wolfram Alpha
and its own web crawler. DDG uses FreeBSD for crawling
and team coordination (Figure 3).

Cloud computing in the past 6 years has grown from
an experimental concept at Amazon to a viable business
model which is reshaping information technology as a
whole. Today cloud computing powers sites such as
Twitter, DuckDuckGo, Digg, Crunchbase, BuildaSearch
and even Amazon's own enormous market-place.

In reality the first clouds are already here, beginning to
influence our everyday lives, from accessing social sites to
filing taxes over the web. As Internet technologies evolve,
bigger and more sophisticated clouds will appear. Even
though the clouds today seem large, they are just
the beginning of a world full of clouds, which in the not too
distant future will include a cloud toaster and a crock-pot
that tweets when the roast is ready.


DIEGO MONTALVO 

Diego Montalvo is the founder of BuildaSearch.com (a site
search engine web service) as well as Urloid.com (a URL
shortening service). Diego by trade is a web developer and
technical writer who enjoys the beach, socializing, staying up
late, exercising and drinking an occasional pint. Feel free to
contact Diego at diego@earthoid.com.
Developing Applications Using mport


In the February issue, you saw how to use the mport system as an
end user. In MidnightBSD, you can access the mport features as a
developer and add support for mport to your existing C, C++ or
Objective C application.


What you will learn...
• Add basic mport management to existing C, C++ or Objective C
applications


When you start to work with mport, any function
that is marked as MPORT_PUBLIC_API is
considered safe for external applications to
use. Feel free to use any function; however, the public
API functions are the only ones guaranteed to work. To
see if a function is marked as public, please look at its
implementation.


Listing 1. Creating and initializing mport instance


mportInstance *mport;

mport = mport_instance_new();

if (mport_instance_init(mport, NULL) != MPORT_OK) {
    /* handle the error, e.g. report mport_err_string() */
}


Listing 2. Looking up package name


const char *packageName = "test_pkg";

mportIndexEntry **indexEntries;

if (mport_index_lookup_pkgname(mport, packageName,
        &indexEntries) != MPORT_OK) {
    /* handle the error, e.g. report mport_err_string() */
}


What you should know...
• C, C++ or Objective C


The include file is mport.h.

Initially, the mport API was developed to allow
MidnightBSD developers to rapidly create new mport
tools.

However, as discussions about new port tools
continued, it became clear that eventually other developers
would want to add mport functionality to their applications.
At this time, MidnightBSD only has a C mport API;
however, we would like to add better scripting support. In
particular, we want to add Python support.


Listing 3. Reading and displaying mport information


int info(mportInstance *mport, const char *packageName) {
    mportIndexEntry **indexEntry;
    mportPackageMeta **packs;
    char *status, *origin;

    if (packageName == NULL) {
        fprintf(stderr, "Specify package name\n");
        return 1;
    }

    /* lookupIndex() wraps the index lookup shown in Listing 2 */
    indexEntry = lookupIndex(mport, packageName);
    if (indexEntry == NULL || *indexEntry == NULL) {
        fprintf(stderr, "%s not found in index.\n", packageName);
        return 1;
    }

    if (mport_pkgmeta_search_master(mport, &packs, "pkg=%Q", packageName) != MPORT_OK) {
        warnx("%s", mport_err_string());
        return 1;
    }

    /* packs is NULL when the package is not installed */
    if (packs == NULL) {
        status = "N/A";
        origin = "";
    } else {
        status = (*packs)->version;
        origin = (*packs)->origin;
    }

    printf("%s\nlatest: %s\ninstalled: %s\nlicense: %s\norigin: %s\n\n%s\n",
        (*indexEntry)->pkgname,
        (*indexEntry)->version,
        status,
        (*indexEntry)->license,
        origin,
        (*indexEntry)->comment);

    mport_index_entry_free_vec(indexEntry);

    return 0;
}


Initializing mport
For any mport call, you need to create a new mport
instance. In the sample code, I created a mportInstance
pointer and created the instance by calling
mport_instance_new().

After I created the mport instance, I initialized it using
mport_instance_init(). If the call was successful, it will
return MPORT_OK.


Working with the mport index
For several tasks, you will first need to load the mport
index. To load the index, call mport_index_load().


int result = mport_index_load(mport);


A successful index load will return MPORT_OK.

To install, update or delete a package, you will need to
look up the package in the index.

In later code examples, you will see how to use the
index entries. The structure for mportIndexEntry is
defined in mport.h. To free an index entry, use
mport_index_entry_free_vec().
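The double pointer returned by the index lookup, together with a free function whose name ends in "vec", suggests a vector of entry pointers. The sketch below shows how such a vector can be walked and released, assuming it is terminated by a NULL pointer (an assumption based on the NULL checks in Listing 3, not something the article states). DemoEntry and the demo_vec_* functions are hypothetical stand-ins, not part of libmport.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in; the real mportIndexEntry is declared in mport.h. */
typedef struct {
    char *pkgname;
} DemoEntry;

/* Free every entry, then the vector itself. Safe to call with NULL. */
void demo_vec_free(DemoEntry **vec)
{
    if (vec == NULL)
        return;
    for (size_t i = 0; vec[i] != NULL; i++) {
        free(vec[i]->pkgname);
        free(vec[i]);
    }
    free(vec);
}

/* Count entries by walking until the NULL terminator. */
size_t demo_vec_count(DemoEntry **vec)
{
    size_t n = 0;
    while (vec != NULL && vec[n] != NULL)
        n++;
    return n;
}

/* Build a NULL-terminated vector holding copies of n names. */
DemoEntry **demo_vec_new(const char **names, size_t n)
{
    assert(names != NULL || n == 0);

    DemoEntry **vec = malloc((n + 1) * sizeof *vec);
    if (vec == NULL)
        return NULL;
    for (size_t i = 0; i <= n; i++)
        vec[i] = NULL;      /* keeps the vector terminated at every step */

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]) + 1;
        DemoEntry *e = malloc(sizeof *e);
        if (e == NULL) {
            demo_vec_free(vec);
            return NULL;
        }
        e->pkgname = malloc(len);
        if (e->pkgname == NULL) {
            free(e);
            demo_vec_free(vec);
            return NULL;
        }
        memcpy(e->pkgname, names[i], len);
        vec[i] = e;
    }
    return vec;
}
```

With this layout, one call frees the whole result of a lookup, which is why the caller in Listing 3 only has to remember a single free call.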


Installing and deleting packages
You will need to have loaded the mport index to install a
package.


const char *packageName;
char *buf, *packagePath;

mportIndexEntry **indexEntry;


After looking up the package in the index, construct
the package path. MPORT_LOCAL_PKG_PATH is the local
downloads location. For default installations, it should be
/var/db/mport/downloads.


asprintf(&packagePath, "%s/%s", MPORT_LOCAL_PKG_PATH,
    (*indexEntry)->bundlefile);
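As a minimal sketch of the path construction, the helper below joins a directory and a bundle file name. It deliberately uses malloc() and snprintf() instead of asprintf(), which is a BSD/GNU extension, so that it compiles on any C99 system. DEMO_LOCAL_PKG_PATH and build_package_path() are hypothetical names; real code would use MPORT_LOCAL_PKG_PATH from mport.h.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the default downloads location quoted in the article.
 * Real code would use MPORT_LOCAL_PKG_PATH from mport.h instead. */
#define DEMO_LOCAL_PKG_PATH "/var/db/mport/downloads"

/*
 * Join a directory and a bundle file name into "<dir>/<bundlefile>".
 * Returns a malloc()ed string the caller must free(), or NULL on error.
 */
char *build_package_path(const char *dir, const char *bundlefile)
{
    if (dir == NULL || bundlefile == NULL)
        return NULL;

    /* directory + '/' + file name + terminating NUL */
    size_t size = strlen(dir) + 1 + strlen(bundlefile) + 1;
    char *path = malloc(size);
    if (path == NULL)
        return NULL;

    int written = snprintf(path, size, "%s/%s", dir, bundlefile);
    assert(written >= 0 && (size_t)written == size - 1);
    return path;
}
```

Calling build_package_path(DEMO_LOCAL_PKG_PATH, "test_pkg-1.0.mport") yields a heap string that the caller releases with free(), just as it would with the buffer filled in by asprintf().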


To check if the package exists locally, call
mport_package_exists().

If you need to download the package, call
mport_fetch_bundle(). If the download was successful, MPORT_OK is
returned.

mport_fetch_bundle(mport, (*indexEntry)->bundlefile)

Additionally, you can verify the package by calling
mport_verify_hash(). The hash for the package is in the
mportIndexEntry.

To install the package, call mport_install_primative(). The
function mport_delete_primative() removes a package.
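The order of operations described above (check locally, download only if needed, verify, install) can be sketched as follows. The article does not give the signatures of the library calls, so the sketch routes every step through a hypothetical PkgOps table of callbacks; the fake_* functions only record the order in which they run. None of these names belong to libmport, and the sketch always verifies the hash even though the article presents that step as optional.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { FLOW_OK = 0, FLOW_FAIL = 1 };

/* Hypothetical callbacks standing in for the library calls named in the
 * text; the real functions are declared in mport.h and their signatures
 * may differ. */
typedef struct {
    int (*exists_locally)(const char *path);            /* nonzero if already downloaded */
    int (*fetch)(const char *bundlefile);               /* FLOW_OK on success */
    int (*verify)(const char *path, const char *hash);  /* nonzero if the hash matches */
    int (*install)(const char *path);                   /* FLOW_OK on success */
} PkgOps;

/* Download only when needed, verify, then install. */
int install_flow(const PkgOps *ops, const char *path,
                 const char *bundlefile, const char *hash)
{
    assert(ops != NULL);
    if (!ops->exists_locally(path) && ops->fetch(bundlefile) != FLOW_OK)
        return FLOW_FAIL;
    if (!ops->verify(path, hash))
        return FLOW_FAIL;
    return ops->install(path);
}

/* Fake operations that only record the order in which they are called. */
static char flow_trace[16];
static int flow_on_disk;

static void note(char step)
{
    size_t n = strlen(flow_trace);
    if (n + 1 < sizeof flow_trace) {
        flow_trace[n] = step;
        flow_trace[n + 1] = '\0';
    }
}

static int fake_exists(const char *path) { (void)path; note('E'); return flow_on_disk; }
static int fake_fetch(const char *file) { (void)file; note('F'); flow_on_disk = 1; return FLOW_OK; }
static int fake_verify(const char *path, const char *hash) { (void)path; (void)hash; note('V'); return 1; }
static int fake_install(const char *path) { (void)path; note('I'); return FLOW_OK; }

const PkgOps fake_ops = { fake_exists, fake_fetch, fake_verify, fake_install };

/* Clear the trace and choose whether the bundle is "already downloaded". */
void flow_reset(int on_disk) { flow_trace[0] = '\0'; flow_on_disk = on_disk; }
const char *flow_get_trace(void) { return flow_trace; }
```

Keeping the steps behind a small table like this makes the ordering easy to exercise without touching the package database; in a real application each callback would wrap the corresponding mport call.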
On the Web


http://www.midnightbsd.org — MidnightBSD project's website 
http://www.midnightbsd.org/documentation/mports/ — mport 
documentation 








Glossary 
mport 








Updating an existing package 
You can update the package and optionally any 
dependencies. For this example, we'll only be updating 
the package. 

Similar to installing a package, you would look up the
index entry and construct the package path. Instead of
calling mport_install_primative(), you would call
mport_update_primative().


Other useful mport functions 

Another important mport structure is mportPackageMeta.
This structure contains additional information such as
language and package categories. Also, to get up or down
dependencies (mport_pkgmeta_get_updepends() and
mport_pkgmeta_get_downdepends()) for a given package, you will
need to have the meta information.

The mport library includes two error reporting functions:
mport_err_code(), which returns the error code as an
integer, and mport_err_string(), which returns the error
message.
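The article does not show how the library stores its errors. A pair of functions like these usually implies a "last error" record that failing calls fill in and callers read afterwards. The following is a generic sketch of that pattern under that assumption; the errdemo_* names and error codes are hypothetical and are not the mport implementation.

```c
#include <assert.h>
#include <stdio.h>

enum { ERRDEMO_OK = 0, ERRDEMO_FATAL = 1 };

/* The most recent error, shared by every function in this module. */
static int last_code = ERRDEMO_OK;
static char last_msg[128] = "";

/* Record an error and return its code, so callers can write
 * "return errdemo_set(...)". */
int errdemo_set(int code, const char *msg)
{
    assert(msg != NULL);
    last_code = code;
    snprintf(last_msg, sizeof last_msg, "%s", msg);
    return code;
}

/* Accessors in the spirit of an error-code / error-string pair. */
int errdemo_code(void) { return last_code; }
const char *errdemo_string(void) { return last_msg; }

/* An example operation: fails on an empty name, succeeds otherwise. */
int errdemo_lookup(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return errdemo_set(ERRDEMO_FATAL, "package name is empty");
    return errdemo_set(ERRDEMO_OK, "");
}
```

A caller checks the return value first and only then asks for the message, which mirrors how Listing 3 reports mport_err_string() after a failed call.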


Summary 

The mport library allows developers to integrate some or
all of mport features into their own applications without
calling exec(). This article is just an introduction to the many
features available to MidnightBSD application developers.
I would recommend going to the MidnightBSD website for
further information.


CARYN HOLT 
Caryn Holt is a MidnightBSD developer and software engineer
for Rovi in Ann Arbor, Michigan.
Taking the BSDA Certification Exam


The first article in this series (in the February 2012 issue) addressed
some common misconceptions about certification and described why
you should be BSDA certified. The second article in this series (in the
March 2012 issue) discussed how to prepare for the BSDA certification
exam. This article will provide some background information on how
the exam is delivered and why. It will then describe where to take
the exam and how to arrange for an exam if there currently isn’t an
examination event or testing center near your location.


The BSDA certification exam became available
in February 2008. The BSD Certification Group
(BSDCG) had several goals in mind when
launching this examination:


• Maintain the psychometric validity of the exam.

• If possible, use BSD operating systems and open
source software for exam delivery.

• Keep the exam price as globally affordable as possible.

• Make the exam available to anyone, regardless of
their location.


Since these goals impact on how the exam is delivered, 
let's take a closer look at each: 


Maintain the Psychometric Validity of the Exam 
Assessing practical, real-world system administration 
skills is an integral component of the BSDA examination. 
A lot of work goes into the exam creation process to 
ensure that the resulting certification is psychometrically 
valid. To maintain the validity of the examination, certain 
requirements need to be followed when the exam is 
taken. For example: 


• the identity of the person taking the exam must be
verified using government issued photo identification.
This is to ensure that the person taking the exam is
who they say they are and to prevent someone from
taking the exam for someone else.

• the person taking the exam needs to be monitored
during the exam to ensure that they don't have
access to additional information sources, tamper with
the exam materials, copy exam questions, or remove
exam materials from the exam room.


An assessment is not accurate if the person is not who 
they claim to be or if a person who doesn't have the 
skills needed to pass the exam finds a way to cheat the 
assessment. This is the reason why exam candidates 
need to have their ID checked and why their activity must 
be monitored when taking the exam. The person doing 
the checking and monitoring is the proctor and they must 
be trusted by the organization which provides the exam. 
The requirement to use a proctor places restrictions on 
how the exam can be delivered. For example, we often
hear the question “why can’t I take the exam online from
home?”. This type of exam delivery is hard to proctor for
several reasons. Verifying the person's ID requires either
a photocopy or viewing the ID on a webcam, making it
difficult to obtain a clear image or to spot a counterfeit.
Monitoring is not as reliable: a webcam can be pointed at
the exam taker, but it won't notice if the person is referring
to notes outside the camera view, has found a way to
subvert the exam application and access additional
resources such as an Internet search in a browser, or is
running screen capture software to copy the contents of
the exam. It also requires the proctor to have access to
the camera view or to the user’s desktop session, which
isn't as scalable as having a proctor monitor a room full of
examinees in person.


Use Open Source Software, Keep the Exam 
Price Globally Affordable, and Make the Exam 
Available to Anyone 

When it comes to exam delivery, these three goals are 
related. Commercial solutions which provide proctored 
testing centers throughout the world exist (the best known 
examples are VUE and Prometric), but they can be 
problematic for several reasons: 


• the testing software does not use open source
software (it is usually Microsoft Windows, Flash,
and Internet Explorer based). This limits the types
of exam questions and makes providing interactive
lab scenarios difficult. While it is possible to create
interactive Flash scenarios, these are expensive to
develop and don't provide the same flexibility as an
operating system running in a virtual environment or a
FreeBSD jail.

• these solutions assume high delivery volume and
charge accordingly, making it difficult to provide an
affordable exam. For example, there is an annual fee
(typically in the high five-figure US dollar range) that
must be paid every year, regardless of the number of
exams delivered. If the testing organization doesn't
deliver enough exams to cover its annual fee, a
financial loss is incurred for that year.

• a psychometrically valid exam undergoes constant
statistical analysis to determine if any questions
need to be modified (e.g. they are determined to be
too easy or too hard). Commercial solutions charge
a publication fee (typically in the low four-figure US
dollar range) whenever exam questions are changed
or whenever new exams become available, providing
a financial disincentive for keeping the exam up-to-date
or providing additional versions of an exam.

• the larger testing companies tend to have testing
centers located in most countries throughout the
world, but charge the largest annual fee. Smaller
companies charge a lower annual fee, but tend to
have good North American coverage and limited
locations in other parts of the world.


When determining if a commercial test delivery solution 
is a good match for an organization that provides an 
exam, one needs to balance the number of people 
expected to take the exam in a year, where those people 
are located, and how much they can afford to pay to take 
the exam. If the number of examinees is low (fewer than
several thousand per year), examinees tend to live in
areas that aren't near a testing center, or examinees tend
to be in countries where an exam fee in the $200 USD
range (the average price for an exam) is unaffordable,
alternative ways to deliver the exam need to be explored.

Before the launch of the BSDA exam, the BSDCG 
launched a Test Delivery Survey to help determine 
the testing needs of the BSD system administrators 
community. The report on that survey is available at
http://www.bsdcertification.org/downloads/delivery_survey.pdf.
One of the conclusions of the report is that “70% of testing
candidates are unwilling to pay more than $100 USD to
take the exam; the cost of using a test delivery solution
will have to fall below this price point and still allow for the
costs of psychometric analysis and administration of the
BSDCG organization”. The survey results also indicated
that “testing candidates are scattered throughout the
globe” (58 countries were represented in the survey) and
that “the majority of testing candidates are willing to travel
to take the exam”.

When the BSDA was launched in 2008, there weren't any 
existing open source test delivery solutions and the annual 
fee imposed by the existing commercial solutions was 
beyond the starting budget of the BSDCG. The decision 
was made to offer a paper based version of the exam at
hosted events around the world and to further research the
feasibility of either a home-grown open source solution or
a more reasonably priced commercial solution.

There are now two delivery methods for
taking the exam: a paper based exam at an exam event or
a computer based exam at a testing center. The content
of the BSDA is the same, regardless of the delivery
method.


Paper Based Exams 
The first exam event was held during the Southern 
California Linux Expo in 2008. Since then, over 120 exam 
events have been organized at technical conferences, 
schools, and places of employment throughout the world. 
You can view the upcoming and past events at
https://register.bsdcertification.org//register/events.

Taking the paper-based exam at an exam event offers
several advantages:


• it provides the opportunity to meet and network with
other system administrators who are also interested
in BSD.

• since the costs to deliver the exam are primarily the
cost of shipping the exams to and from the event, the
price of the exam can be kept at the mostly globally
affordable price of $75 USD.

There are also some limitations to this method of exam
delivery:


• there are only so many events and locations per year.

• organizing and advertising upcoming exam events
relies on the assistance of the community.


Efforts by a local community to arrange and promote exam 
events in their geographic area directly impact on the 
success of an exam event. In turn, successful exam events 
benefit the local system administrator community and help 
to promote the use of BSD in that geographic area. 


Arranging for an Exam Event 
There are several advantages to arranging an exam event 
in your city: 


• you don't have to wait until an exam event comes to a
location near you.

• as a member of your geographic community, you
have a better idea of which local resources are
available for hosting an exam event.

• it provides a networking opportunity to find and
associate with other BSD system administrators.

• the time leading up to the event provides a study
opportunity to meet in person and help each other
learn the exam objectives.


If you are interested in seeing an exam event organized
in your city, check to see if your employer, a local
educational institution or training center, your local user
group, or an upcoming technical conference is interested
in hosting an exam event.

In order to host an event, the interested organization 
needs to be able to provide: 


• a quiet room that can comfortably seat 6-8 people,
not too close together. Internet is not needed as the
exam is paper based. A suitable room is typically
easy to arrange with an employer or school. If
the event is being organized by a user group that
doesn't have its own facility, check with the local
library or city hall to see if it is possible to reserve a
room in a municipal building. If you are contacting a
conference organizer, ask if you can reserve one of
the conference rooms for a period of 2 hours either
during the lunch hour or at the end of the conference
day.

• a trusted person to act as the proctor. The person to
act as proctor should be one of: someone known in the BSD
community, a teacher at an accredited educational
institution, a trusted employee, or a speaker at a
conference. The proctor will be required to adhere to
an NDA to protect the integrity of the exam and cannot
be certified in the exam that they are proctoring.
Depending upon the location, the BSDCG may
already know of a proctor who lives close by or who
is able to travel to the event. The BSDCG can also
assist you in finding a suitable proctor.

• 6-8 weeks notice to give the BSDCG time to advertise
the event and ship the exams.

• at least 4 people interested in taking the exam
at the event. Talk to your coworkers, fellow students,
user group members, or use social media to see if
you can drum up some interest.


Once you find an organization willing to host an exam
event, contact chair@bsdcertification.org with the details
about the location and date. The BSDCG will work with
you to make sure a suitable proctor is available, that the
event is added to the registration website and advertised
through social media, and that the exams are shipped to
arrive in time for the event.


Computer Based Exams 

Beginning in April 2011, the BSDCG partnered with SMT
to offer a computer based version of the BSDA at IQT
testing centers (you can read the press release with the
details at http://bsdcertification.org/news/pr061.html). This
partnership provides several advantages:


• you don't have to wait for an exam event as you can
schedule your exam at any time.

• you don't have to wait to receive your exam results as
your score report is printed for you at the end of the
exam.


Depending upon where you live, there are some 
limitations to this exam delivery method: 


• most testing centers are in North America, though
there are centers located outside of North America.
The list of testing center locations is here:
http://www.isoqualitytesting.com/mlocations.aspx.

• the exam price is higher in order to cover the costs
of using the testing center network. The price to take
the exam at a testing center is $150 USD, which is still
affordable in North America and Western Europe but
which may not be affordable in some parts of the world.


If you run a testing center or know of a testing center in
your city that would like to be added to the IQT testing
network, contact chair@bsdcertification.org with the
details. We are especially interested in adding at least
one testing center in Russia as this country has many
people interested in BSD certification.


Registering for an Exam 
In order to take the BSDA exam you must first register 
for a BSDCG ID at https://register.bsdcertification.org// 
register/get-a-bsdcg-id. Once you have an ID, check the 
events page (https://register.bsdcertification.org//register/ 
register-for-an-exam) to see if there is an exam event or 
testing center location near you. If there isn't, let us know 
if an organization in your area is interested in hosting an 
exam so that we can add it to the events page. 

If you select a paper exam event, the proctor will be
notified of your registration so that they know who to
expect on exam day and so that they can notify you if
there is a room change.

If you select a computer based exam, you will not
be able to schedule your exam until after payment is
received. After you make your payment, a link will be
emailed to you with the information that you will need to
schedule an exam. You have up to one year to schedule
the exam after making your payment.

Most exam payments are made through PayPal. If you 
need an invoice for your payment or are unable to pay using 
PayPal, send an email to register@bsdcertification.org. 


Summary 

This article described the exam delivery methods for the 
BSDA exam, how to arrange for an exam event, and how 
to register for an exam. 

In the June issue of BSD Mag this series will continue 
by describing how exam questions are created and how 
interested system administrators can contribute to the 
exam creation and review process.

The BSD Certification Group Inc. (BSDCG) is a non-profit
organization committed to creating and maintaining a global
certification standard for system administration on BSD
based operating systems.


WHAT CERTIFICATIONS ARE AVAILABLE?

BSDA: Entry-level certification suited for candidates
with a general Unix background and at least six months of
experience with BSD systems.

BSDP: Advanced certification for senior system administrators
with at least three years of experience on BSD systems.
Successful BSDP candidates are able to demonstrate
strong to expert skills in BSD Unix system administration.

WHERE CAN I GET CERTIFIED?

We’re pleased to announce that, after 7 months of
negotiations and the work required to make the exam
available in a computer-based format, the BSDA
exam is now available at several hundred testing centers
around the world. Paper-based BSDA exams cost $75 USD.
Computer-based BSDA exams cost $150 USD. The price of
the BSDP exam is yet to be determined.


Payments are made through our registration website: 
https://register.bsdcertification.org//register/payment 


WHERE CAN I GET MORE INFORMATION?


More information and links to our mailing lists, LinkedIn 
groups, and Facebook group are available at our website: 
http://www.bsdcertification.org 


Registration for upcoming exam events is available at our
registration website:
https://register.bsdcertification.org//register/get-a-bsdcg-id

Installing OpenBSD 5.0 on VMware Server

We're going to install OpenBSD 5.0. With the information in
this article, you'll learn how to install it both on your own
computer and via VMware Server. For my example, I'll use
my own Bsdvm.com account.


What you will learn...
• How to install a new instance of OpenBSD 5.0.

What you need: One of
• A computer and an OpenBSD 5.0 CD, or
• A Bsdvm.com account and a Windows computer (a limitation of
  VMware Server)
• Your own VMware Server 2.0 instance.

As far as I know, Bsdvm.com is the only hosting
provider that provides console access to your
virtual private server (VPS). If you intend to install
OpenBSD 5.0 on your own computer, then just skip past
step 5.

Step 1

Get an account with Bsdvm.com.

Step 2

E-mail support@bsdvm.com, and ask them for two things:
to insert the OpenBSD 5.0 ISO into your virtual machine and
to give you information for accessing the console.

Step 3

Configure Internet Explorer to accept SSL 2.0. Both
requirements are limitations of VMware Server. The VMware
plugin (a Windows .exe file) used to work with Firefox, but
it has not since Firefox began incrementing its version
numbers so quickly. VMware has also not upgraded
its Server product to use anything better than SSL 2.0.

Step 4

Browse to the address given by Bsdvm.com support staff,
and log on. Figure 2 shows how to find the console screen,
which will prompt you to install the console plugin.





Figure 1. Internet Explorer Settings

Step 5

Notice the power off, pause, power on, and reset buttons,
marked with the universal icons, that are available to you.
Once you have your console session going, press F2 during
the POST to get into the BIOS. Set the VM to boot from the
CD-ROM first.

Now we're ready to start installing OpenBSD 5.0.
Whenever there is a default answer, pressing ENTER
without any input will accept that default answer (which
you'll see in brackets). One by one, here are the questions
that you'll be asked:

(I)nstall, (U)pgrade, or (S)hell?
Answer “i” to install OpenBSD 5.0.

Choose your keyboard layout. 
Most people can accept the default layout. Press “?” for 
more options. 


System hostname? 
The answer is up to you. 


Which network interface do you want to configure? [vic0]
Press ENTER to accept the default of vic0.


You will get the answers to the following questions from 
support@bsdvm.com: 


• IPv4 address
• Netmask
• IPv6 address (probably none)

Figure 2. The Console Access Page

Now you'll be asked which other network interface you 
want to configure. Press ENTER to accept the default of 
none. 


Default IPv4 route? 
Get the answer from support@bsdvm.com. 


DNS Domain Name? 
That's up to you. 


DNS Nameservers? 
Get the answer from support@bsdvm.com. 


Do you want to do any manual network configuration? [no]
No.


Password for root account? 
That’s up to you. You'll have to confirm it. 


Start sshd(8) by default? [yes] 
Yes. That’s how we're going to access the system after 
the install is done. 


Start ntpd(8) by default? [no] 
Yes. This will keep your time synchronized. 


Do you expect to run the X Window System? [yes]
No. Since this tutorial assumes that you have limited
resources, we're not going to use X.


Change default console to com0? [no] 
No. 


Figure 3. VMware Client BIOS


Setup a user (enter a lower-case login name or ‘no’) [no] 

Type in a user name here. Mine is “toby”. Later on, we'll 
secure the server so that the root user cannot login over 
SSH. 


Full username for user [toby]
Your full name. I put “R. Toby Richards” here.


Password for toby account? 
Type a password. You'll have to confirm it. 


Since you set up a user, disable sshd(8) logins for root? [yes]
Yes. This is definitely a best practice for security.


What timezone are you in? 
Use the “?” to find your time zone. 


Available disks are: sd0. Which one is the root disk (or
‘done’)? [sd0]
sd0


Now we get into disk setup. I’m not going to write out all
the text you'll see. First press “w” to use the whole disk.
Then you can press “a” to accept the default partition
layout, but if you press “c” for a custom layout, then
the procedure for using a single partition (plus a swap
partition) is below. We'll use the following commands:

• z: remove all partitions
• a: add a partition
• m: modify a partition
• w: write partition table
• q: quit

Listing 1. Single Partition Setup

Figure 4. Final Steps of OpenBSD 5.0 Installation



My numbers are for a 10 GB drive and 512 MB RAM. I
don’t want to deal with block numbers, so I first create
a partition that uses the whole disk, then I resize that
partition to make room for swap. You should have the
same amount of swap space as RAM. You'll see me
use “-512M”. You should adjust that number to suit your
needs.
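
The sizing arithmetic behind that “-512M” can be sketched in a few lines of shell. This is only an illustration: the function name root_size_mb is made up for this example, sizes are in megabytes, and swap is assumed to equal RAM as recommended above.

```sh
# Sketch: size of the root partition left after carving swap out of the disk.
# Sizes are in megabytes; swap is sized to match RAM, as in the text.
# root_size_mb is a hypothetical helper, not part of the OpenBSD installer.
root_size_mb() {
    disk_mb=$1
    ram_mb=$2
    echo $((disk_mb - ram_mb))
}

# 10 GB disk (10 * 1024 MB) with 512 MB of RAM
root_size_mb $((10 * 1024)) 512    # prints 9728, i.e. 9.5 GB
```

With 1 GB of RAM on the same disk, the same call with 1024 as the second argument gives the root size to expect after resizing by “-1024M”.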

Now I’ve got a 9.5 GB root partition and 512 MB of swap
space. Next, we get asked which packages to install. Like
I said, we're not installing X, so when the list is presented,
I type “-x*” to unselect the X packages. We do need
the xbase package to use ports though, so I then type
“+xbase*”. You're pretty well done with setup. You'll see
the system installing your selected packages, and then
you can reboot. See Figure 4.

Remember: when you reboot, you'll need to go back 
into BIOS with F2 to set the hard drive as the first boot 
device. 

At this point, you're done. Enjoy your shiny new 
OpenBSD server. 


TOBY RICHARDS 
Toby Richards has been a network administrator since 1997. 
Each article comes straight from the notes that he takes when 
doing a new project with *BSD. Toby recommends bsdvm.com 
for your hosting needs because they provide console access to 
your virtual machine.

Installing FreeBSD on Amazon AWS EC2 Cloud Services

I have had an AWS account since Amazon first introduced
the Elastic Compute Cloud (EC2). But to be honest, back in
the day it was fairly cryptic to get an instance running:
the AWS web interface was in its infancy and documentation
was limited.


What you will learn...
• Installing FreeBSD on Elastic Compute Cloud (EC2)
• Running a Virtual Server on the Cloud
• Setting up a virtual server using PC-BSD or Windows

What you should know...
• Basic knowledge of Terminal or PuTTY
• chmod shell commands
• Basic knowledge of ssh

Even though I did get some sort of Linux instance
running, I believe the pricing was a bit steep for a
lone ranger such as myself.

Since the inception of EC2, folks wanting to run
FreeBSD could not, due to some conflicts between the
AWS Xen and FreeBSD. In December 2010, however,
it was announced that FreeBSD was available on EC2.
Moving forward to 2012, I founded BuildaSearch.com, a
site search service which, you guessed it, is powered by
FreeBSD. Anyways, knowing that FreeBSD could now run
on EC2, I decided to build a customized EC2 instance so
that I can test out some new BuildaSearch features.

This tutorial will provide the steps needed to have a
FreeBSD virtual server on EC2 using PC-BSD or
Windows.

What you will need: Amazon AWS account, Web browser,
Terminal (PC-BSD), PuTTY, PuTTYgen (Windows)

Getting Started

1.) (Windows) Before we begin with AWS services you will
need to download “PuTTY” and “PuTTYgen” from the PuTTY
website.

2.) Log into your AWS account and click on the EC2 Virtual
Servers in the Cloud link. (Figure 1)

Figure 1. Amazon AWS Management Console



3.) In order to log into your EC2 virtual server using Terminal or PuTTY, you will need to create a Key Pair.
Inside the EC2 dashboard under NETWORK & SECURITY, click on Key Pairs and then Create Key Pair >
“name_your_key” > finish by clicking Create. (Figure 2)








Figure 2. Creating a Key Pair using PuTTYgen 


4.) Next you will be prompted to download “your_key.pem”.
(Windows) Once it is downloaded, run PuTTYgen.exe. From
the File menu choose Load private key. Open and load
“your_key.pem”. Once it is loaded (Figure 3), click on the
Save private key button.











Figure 3. Creating a PuTTY compatible key pair

Figure 4. Saving PuTTY key without a passphrase


5.) (Windows) When prompted “Are you sure you want
to save this key without a passphrase to protect it?”,
click the Yes button (Figure 4). You will now have a
“your_key.ppk” file, which can be used with PuTTY and the
EC2 services.


We are now ready to build a FreeBSD virtual server
on EC2.

6.) Inside the EC2 dashboard click on the Launch Instance
button. Next, Name Your Instance and under Choose a Launch
Configuration click on More Amazon Machine Images, then
click Continue. (Figure 5)

Figure 5. Creating a new (AMI) virtual server

7.) Type “freebsd” into the search box and click Search.
All FreeBSD instances will be listed. In this tutorial we
chose FreeBSD/EC2 8.2b-RELEASE i386/XEN, the FreeBSD
8.2b-RELEASE AMI for t1.micro instances (ami-b55f99dc).
Once you've chosen your Amazon Machine Image (AMI) click
Continue. (Figure 6)

Figure 6. Choosing a FreeBSD (AMI)

8.) You may edit your virtual server settings by clicking
on Edit Details. The default settings will suffice for this
tutorial. Click Launch (Figure 7), then close the Create a
New Instance window.

Figure 7. Configuring and launching your new (AMI)

Figure 8. Copy and paste your Public DNS URL
10.) (Windows) Start PuTTY and in the Host Name textbox
paste your Public DNS address. In the Category menu, under
Connection, click SSH, and then Auth. The options
controlling SSH authentication will be displayed. (Figure 9)

(PC-BSD) Locate your “your_key.pem” file; next you will
need to set permissions for your key (it must not be
publicly viewable to work):

# chmod 400 your_key.pem
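
For PC-BSD users, a small check like the following can confirm that the key file is private before connecting. It is a sketch only: key_is_private is a hypothetical helper name, and it inspects the permission string printed by ls rather than relying on any platform-specific stat options.

```sh
# key_is_private <file>
# Succeeds only if group and others have no permissions on the file,
# which is what "chmod 400" guarantees. Hypothetical helper for illustration.
key_is_private() {
    # Characters 5-10 of the "ls -l" mode string are the group/other bits.
    perms=$(ls -ld "$1" | cut -c5-10)
    [ "$perms" = "------" ]
}

# Example use (not run here):
# key_is_private your_key.pem || echo "run: chmod 400 your_key.pem"
```

ssh itself refuses a private key file that other users can read, so a check like this only reports the problem a little earlier.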


(PC-BSD) # ssh -i your_key.pem root@ec2-1234-13.your

Figure 9. Loading “your_key.ppk” into PuTTY

11.) (Windows) Click Browse and navigate to the PuTTY
private key file you generated in the previous section.
Click Open. (Figure 10)

Figure 10. Connecting to EC2 via PuTTY

Figure 11. Connecting to EC2 via PuTTY
































13.) You are now connected to your virtual server on
EC2. Cheers!

Figure 12. You are now connected to your virtual server

PostgreSQL: Replication

In the previous articles we saw how to set up a PostgreSQL
cluster, how to manage backups (either logical or physical) and
how transactions work internally. In this article we will see how
it is possible to replicate a running cluster to another instance
in order to have a fully mirrored and active “stand-by” node.


What you will learn...
• what replication is
• how to set up intra-cluster replication
• how to monitor a replica slave

What you should know...
• basic SQL concepts
• basic PostgreSQL concepts and configuration parameters
• basic shell commands

Most of the configuration will be done by simple
shell scripts, in order both to show the required
instructions and to allow readers to replay the experiments
several times. Please note that the configurations
shown in this article are for didactic purposes only, and
represent only a starting point for replica environments.

Glance at Replication

Replicating a database cluster means setting up an on-line
clone of such “master” cluster: the clone cluster
must be kept up to date with the master one and must
be available as soon as possible in case the master
node fails. Replication opens the door for High Availability
(HA): you can replicate your main database on a slave (or
more) to be sure that, in case the master node crashes
or becomes unavailable, the slave one can be promoted
and immediately substitute the failed master.

Since the main idea behind replication is to have clones,
it is strongly recommended to set up environments that are
as similar as possible, from the PostgreSQL instances to the
underlying operating systems and hardware. Ideally you
can place replicating nodes wherever you want, from
the same machine to different and far away systems.
Of course the reliability of replication (and therefore of High
Availability) is built on top of the reliability of the hardware,
of the operating system, of the network connection, and so
on. The configuration of the replication network depends
on the aim you want to achieve, ranging from a simple
on-line backup node, through a development environment,
to a High Availability (HA) system.

There are several replication solutions for PostgreSQL,
either open source or commercial. This article will show
only the natively supported replication methods available
since the 9 series. It is worth noting that PostgreSQL
replication is mature, since it comes after years of
development and testing, and is therefore production
ready. It is also worth noting that PostgreSQL versions
prior to 9 relied on external replication tools; many of
them still exist and are available to DBAs, who therefore
have a large set of choices depending on their needs,
environment, and expertise.


Setting up the Environment 
In order to enable replication you must have at least one
cluster with the role of “master” and one with the role of
“stand-by”. For the purposes of this article we will use
instances on the same machine: /postgresql/cluster1 is
the master and is configured via rc variables;
/postgresql/clusterX (X being a number) will be the stand-by
instance running on the same machine but on a different
TCP/IP port.

In order to allow readers to experiment with replication
a couple of shell scripts have been built; such scripts
configure and test a replicating standby node (identified by
its number) running on the local machine. In particular,
cluster X will listen on the master's TCP/IP port increased
by X. In the following
the details for each kind of replication will be discussed,
but please consider that cluster numbering will have the
following meaning:


• cluster 1 is the master node (listening on port 5432);
• cluster 3 is the log shipping replicated stand-by
  (listening on port 5435);
• cluster 4 is the log streaming replicated stand-by
  (listening on port 5436);
• cluster 5 is the Hot Standby replicated node (listening
  on port 5437).
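
The numbering rule in the list above can be written as a one-line calculation. The following is a minimal sketch; the names MASTER_PORT and standby_port are made up for this illustration and do not come from the article's scripts.

```sh
MASTER_PORT=5432   # port of cluster 1, as stated in the list above

# Port for standby cluster number N: master port plus N.
# standby_port is a hypothetical helper, used here only for illustration.
standby_port() {
    echo $((MASTER_PORT + $1))
}

for n in 3 4 5; do
    echo "cluster $n -> port $(standby_port "$n")"
done
```

Running it lists ports 5435, 5436 and 5437 for clusters 3, 4 and 5, matching the list.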


Cluster number 2 is skipped because it was used in a
former article to set up a Point In Time Recovery, and
we don't want to mess up the reader's environment. Please
also consider that for each replication example only the
master (cluster 1) and the appropriate stand-by node will
be active on the machine.

Please note that in this article we are going to use terms
like “master” and “stand-by” to refer to the currently active
cluster and the replicating one, even if in a replication
environment such a distinction is often blurred, since a
stand-by can be reconfigured to play as master and the
master as stand-by. Moreover, readers
will find the term “promote” used to indicate a switch-over
of a stand-by node, which is promoted to the role of
master due to a master crash or failure.

The script 01-createAndConfigureStandby.sh (see Listing 4)
creates the configuration of the stand-by node; it accepts
the node number and the type of replication to configure.
The script 00-workload.sh (see Listing 5) provides a
simple workload on the master and stand-by node to
see how replication is applied. In the examples below
the machine hosting all the clusters has an IP address of
192.168.200.2.


A WAL to rule them all! 

The solution PostgreSQL adopts to perform replication is
elegant and efficient: each standby node is instructed
by the master to apply the same changes to the data so
that, after a delay, the standby will contain the same data
as the master. Such instruction is performed through
the WALs (Write Ahead Logs), which contain a snapshot
of the data and the changes performed by transactions.
Sending the WALs to the standby nodes allows each
node to replay the transactions done on the master and so
follow it through the data changes. The concept is really
similar to the physical backup performed in the first
article (see in particular PITR): a standby instance replays
transactions done on the master using the WALs (similarly
to what is done in case of a crash). The standby node
continuously replays transactions, keeping up to date with
the master node.

Please note that each stand-by node must start from a
physical copy of the master cluster, therefore each kind
of replication must perform a pg_start_backup(), a physical
backup and a pg_stop_backup() on the master before the
replica can be started (see Listing 4).


Log Shipping Replication 

The first method explained here is “Log Shipping”
replication. The configuration and the concept are really
similar to those behind the PITR discussed in the first
article of this series. The master node archives the WALs
in a location available to the standby node(s); the latter
fetch the WALs and replay transactions continuously
(see Figure 1). In this example the /postgresql/pitr
directory is used as a common archive of the WALs, so
the postgresql.conf file for the master node will include
the following options to archive the WALs in the above
directory:


wal_level = 'archive'
archive_mode = on
archive_command = 'cp -i %p /postgresql/pitr/%f'


The standby node will act as a pure clone, so it will not
archive WALs on its own and therefore its postgresql.conf
file contains:

wal_level = 'minimal'
archive_mode = 'off'


It is important to tell the standby node that it is
actually a standby and not a stand-alone master server,
as well as where it can find archived WALs. To do this, a
recovery.conf file is created:

standby_mode = 'on'
restore_command = 'cp /postgresql/pitr/%f %p'
archive_cleanup_command = 'pg_archivecleanup /postgresql/pitr %r'
trigger_file = '/postgresql/standby.3.trigger'


The first option (standby_mode) informs the cluster that
it is acting as a standby node, and therefore it will be
continuously replaying WAL transactions. The
restore_command is used to fetch the master’s archived
WALs. Of course this command must be the counterpart of the
archive_command specified in the master postgresql.conf
and can be any command you want, depending on where
and how access to the WALs is provided. The
archive_cleanup_command is an optional directive: it
specifies a command that will be run over already replayed
WALs; it is usually a command to remove old WALs to recover
disk space. The pg_archivecleanup command does
exactly that, removing from the disk the WALs that precede
the last restart point (%r). It is worth noting that the
pg_archivecleanup command does not know how many
standby nodes are fetching WALs from the repository.
In other words, imagine that more than one standby
node is configured as stated above over the same
WAL archive path: one standby could erase
WALs not yet processed by another standby node. To
solve this problem DBAs have to carefully configure the
WAL archive storage, for example keeping a separate
directory for each standby instance so that each instance
can perform its own cleanup without damaging other
standby node activities.
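
One way to keep a separate archive directory per standby is to have the master copy each WAL segment into every directory. The function below is a minimal sketch under stated assumptions: archive_wal is a hypothetical helper, not part of PostgreSQL, and it would have to be wrapped in a script of your own that the master's archive_command calls with %p and %f plus the directories you choose.

```sh
# archive_wal <wal_path> <wal_name> <dir>...
# Copies one WAL segment into a separate archive directory for each
# standby, so that each standby can clean up its own copy.
# Hypothetical helper: directory layout and name are assumptions.
archive_wal() {
    wal_path=$1
    wal_name=$2
    shift 2
    for dir in "$@"; do
        target="$dir/$wal_name"
        if [ -e "$target" ]; then
            # Already archived: accept an identical copy, refuse a different one.
            cmp -s "$wal_path" "$target" || return 1
        else
            cp "$wal_path" "$target" || return 1
        fi
    done
}
```

The function returns non-zero on any failure and tolerates segments that were already copied, so that a retry after a partial failure can complete instead of stopping at the first directory.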

The last directive, trigger_file, is used for the standby
node promotion. In fact, once a standby cluster is
configured as above, it is running but it is not accepting
user connections. In other words, trying to connect to the
standby cluster will result in an error message:

$ psql -U bsdmag -p 5435 bsdmagdb
psql: FATAL: the database system is starting up


Figure 1. Scenario for log-shipping replication

The database cluster is already started, but it is not
ready yet, since it is continuously replaying WALs. Once
the standby node finds the trigger file, it stops replaying
WALs and begins its own life as a standalone cluster:
from now on, the master node and the standby one are
separated. The idea is that the trigger file can act as
a notifier that something went wrong on the master site and
that the standby must substitute it. Readers can develop
their own programs to query the master and, in case of
failure, trigger the standby node. In the examples shown
here promoting the standby node is done using the
following command:


$ touch /postgresql/standby.3.trigger
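
A program that queries the master and triggers the standby on failure could be as small as the sketch below. It is only an illustration: promote_if_down is a hypothetical helper, and the check command is whatever test you trust; the commented example assumes the bsdmag user and bsdmagdb database used elsewhere in this article.

```sh
# promote_if_down <trigger_file> <check command...>
# Runs the check command; if it fails, creates the trigger file so that
# the standby stops replaying WALs and starts accepting connections.
# Hypothetical helper, shown only as a sketch.
promote_if_down() {
    trigger=$1
    shift
    if "$@" >/dev/null 2>&1; then
        return 0            # master answered, nothing to do
    fi
    touch "$trigger"
}

# Example call (not run here):
# promote_if_down /postgresql/standby.3.trigger \
#     psql -U bsdmag -p 5432 -c 'SELECT 1' bsdmagdb
```

A single failed check is a crude failure detector; in practice you would probably want several consecutive failures before promoting, since promotion separates the two clusters for good.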


The actions for performing log shipping replication are
summarized as follows:

• configure the master node to archive WALs;
• perform a physical backup of the master node in
  order to create the standby cluster;
• configure the standby cluster to not archive WALs
  (more on this later);
• create a recovery.conf file that tells the
  standby node to act as a standby and that contains a
  command to fetch (and possibly delete) WALs;
• when required, generate the trigger file to let the
  standby node detach from the master and start
  accepting user connections.
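
The recovery.conf step in the list above can be sketched as a small shell function. This is not the article's Listing 4: write_recovery_conf is a hypothetical name, and the function simply writes the four settings discussed earlier for a given data directory, node number and WAL archive directory.

```sh
# write_recovery_conf <standby_data_dir> <node_number> <wal_archive_dir>
# Writes the recovery.conf settings discussed above into the standby's
# data directory. Hypothetical helper; paths follow the article's layout.
write_recovery_conf() {
    data_dir=$1
    node=$2
    archive_dir=$3
    cat > "$data_dir/recovery.conf" <<EOF
standby_mode = 'on'
restore_command = 'cp $archive_dir/%f %p'
archive_cleanup_command = 'pg_archivecleanup $archive_dir %r'
trigger_file = '/postgresql/standby.$node.trigger'
EOF
}

# Example call (not run here):
# write_recovery_conf /postgresql/cluster3 3 /postgresql/pitr
```

Passing the archive directory as a parameter makes it easy to give each standby its own directory, which avoids one standby cleaning up WALs another still needs.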


In this example the standby node has been configured
not to archive WALs on its own; this means that once the
standby has been promoted and the master has recovered
from its failure, the two databases will remain
misaligned. Depending on the environment you want to
set up, the standby node could archive its own WALs
so that the recovered master could be configured
to play as standby; of course other scenarios are
possible.

Listing 1 shows how to see log-shipping replication
in action: the standby cluster is number 3, and the
workload inserts one million tuples into the magazine
table and generates a test table with 500 thousand
tuples. The final result is that both clusters have the same
data:


Activating the stand-by node...

Tuples in the master node (magazine table)
1000000
Tuples in the master node (test table)
500000

Tuples in the slave node (magazine table)
1000000
Tuples in the slave node (test table)
500000





Listing 1. Configuring and running a log shipping based
replication node

~> sh 01-createAndConfigureStandby.sh 3 logshipping
~> /usr/local/bin/pg_ctl -D /postgresql/cluster3 start
~> sh 00-workload.sh 3 promote


Log Streaming Replication

In log-shipping replication, WALs are transferred
from the master to the standby node(s) using an external
tool and storage; in the example, a shared directory and
the cp(1) command. Log streaming replication, on the
other hand, does not require a shared WAL archive,
since each standby node can autonomously ask the
master node for the WALs over a TCP/IP connection
(see Figure 2). There is therefore no need to set up a
shared repository, which could be another point of failure
in the whole system. Moreover, the whole setup is fully
PostgreSQL based, since the connections are made
directly from one PostgreSQL instance to another.
For security reasons, the master cluster requires a specific
user with the REPLICATION grant to be used for incoming
WAL requests. Standby nodes will connect to the master
using such a user (it is possible to share the same user or
to create a per-standby user). To create a replication user
on the master node, a command like the following must
be issued:


CREATE USER replicator WITH REPLICATION LOGIN CONNECTION
LIMIT 1 PASSWORD 'replicatorPassword';


Besides the REPLICATION grant, the user must also be
enabled for connection from the standby node, so the
pg_hba.conf file of the master node has to be modified
with an entry that allows the standby node and the above
user to connect for replication:


host replication replicator 192.168.200.0/24 trust 


It is possible to restrict the host-based authentication
entry so that it accepts connections only from a single
host instead of the entire network, and to require a
password instead of using trust; for the purposes of this
example, the above configuration suffices.
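The stricter entry mentioned above can be sketched in shell. This is only an illustration under stated assumptions: the function name add_replication_entry, the stand-by address 192.168.200.3/32 and the choice of md5 are hypothetical and do not come from the article's scripts. With a password-based method the stand-by would also have to supply the password when connecting.

```sh
#!/bin/sh
# Append a replication entry to a pg_hba.conf file only when the very
# same line is not already there (hypothetical helper, not part of the
# article's scripts).
add_replication_entry() {
    hba_file=$1    # path of the master pg_hba.conf
    repl_user=$2   # role holding the REPLICATION grant
    address=$3     # single host (/32) or whole network
    method=$4      # e.g. trust or md5
    entry="host replication ${repl_user} ${address} ${method}"
    # -x matches whole lines only, -F takes the entry as a fixed string
    if ! grep -qxF "$entry" "$hba_file" 2>/dev/null
    then
        echo "$entry" >> "$hba_file"
    fi
}

# Example call (the address is an assumption):
#   add_replication_entry /postgresql/cluster1/pg_hba.conf \
#       replicator 192.168.200.3/32 md5
```

Calling the function twice with the same arguments leaves a single line in the file, so it can be run safely each time a stand-by is rebuilt. The master must reload its configuration before the new entry takes effect.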

On the master cluster the configuration changes so
that a set of dedicated processes serves WALs
to the standby nodes; in this case there is no need for an
archive command that actually archives WALs, so it is
possible to use any command that returns a “true” status.
The postgresql.conf file will therefore contain the
following directives:


wal_level='archive'
archive_mode=on
archive_command='test 1 = 1'
archive_timeout=30
max_wal_senders=1
wal_keep_segments=50


In the above, a single process is dedicated on the
master side to accepting and serving incoming standby
WAL requests. The wal_keep_segments parameter instructs
the master to keep a number of WAL files (each 16 MB
long) in the pg_xlog directory just in case a stand-by
connection fails. In this way, the stand-by node can later
fetch the WALs required by the replication process. In
other words, a stand-by connection can be down for no
longer than the time the master takes to produce
wal_keep_segments segments; after that there is no
guarantee that the replication will be successful, since
the master could have already deleted a WAL needed
by the stand-by. Please note that the archive command
can be anything that returns a valid status (success) and
that every 30 seconds a new WAL segment is forced to be
written to disk (and is therefore available to the standby
nodes).
Listing 2. Configuring and running a log streaming based
replication node


~> sh 01-createAndConfigureStandby.sh 4 logstreaming
~> /usr/local/bin/pg_ctl -D /postgresql/cluster4 start
~> sh 00-workload.sh 4 promote











Similarly, the standby configuration does not require a
restore command to fetch the WALs from an archive
location; consequently there is no need for an archive
cleanup command either. All the standby node needs now is
the parameters for the connection back to the master node,
so that it can ask the latter for the WALs. This is specified
in the recovery.conf file as follows:


standby_mode='on'
primary_conninfo=' host=192.168.200.2 user=replicator'
trigger_file='/postgresql/standby.4.trigger'


As in log shipping replication, the standby node has
to be instructed to act as a replica and not as a
stand-alone cluster (standby_mode='on'), and it will start
living on its own as soon as the trigger file is found.
As readers can see, the standby node connects back
to the master using the primary_conninfo properties,
in particular the host IP address and the user for the
connection.
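The three recovery.conf directives discussed above can be produced with a few lines of shell. The following is a minimal sketch: write_streaming_recovery_conf is a hypothetical helper name, and the values in the example call are the ones used in this article.

```sh
#!/bin/sh
# Write a recovery.conf for a log streaming stand-by
# (hypothetical helper, shown for illustration only).
write_streaming_recovery_conf() {
    recovery_file=$1   # e.g. /postgresql/cluster4/recovery.conf
    master_host=$2     # IP address of the master node
    repl_user=$3       # role holding the REPLICATION grant
    trigger_file=$4    # file whose creation promotes the stand-by
    {
        echo "standby_mode='on'"
        echo "primary_conninfo='host=${master_host} user=${repl_user}'"
        echo "trigger_file='${trigger_file}'"
    } > "$recovery_file"
}

# Example call:
#   write_streaming_recovery_conf /postgresql/cluster4/recovery.conf \
#       192.168.200.2 replicator /postgresql/standby.4.trigger
```

Grouping the echo commands in braces writes the file in a single redirection, so a partially written recovery.conf is less likely to be left behind if the script is interrupted between lines.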

It is possible to configure a log streaming replication
environment by running the commands of Listing 2, which
perform the above steps and launch a workload on the
master, then promote the standby to a stand-alone node.
As in the previous case, the result of the workload is that
both the master and the standby (now promoted) have
the same number of tuples:


Activating the stand-by node... 





Tuples in the master node (magazine table) 
1000000 

Tuples in the master node (test table) 
500000 





Tuples in the slave node (magazine table) 
1000000 

Tuples in the slave node (test table) 
500000 


Actions for performing a log streaming replication are
summarized as follows:

• configure the master node to archive WALs;

• perform a physical backup of the master node in
order to create the standby cluster;

• configure the standby cluster to not archive WALs;

• create a recovery.conf file that instructs the
standby node to act as a standby and to ask the
master for the WALs;

• when required, generate the triggering file to let the
standby node detach from the master and start
accepting user connections.

As readers can see, the configuration procedure is almost
the same as in the previous case (log shipping), except
for a few parameters.


Figure 2. Scenario for log-streaming replication


Listing 3. Configuring and running a streaming replication node

~> sh 01-createAndConfigureStandby.sh 5 hotstandby
~> /usr/local/bin/pg_ctl -D /postgresql/cluster5 start
~> sh 00-workload.sh 5 show


Making Stand-by nodes hot! 

In the two previous replication scenarios the stand-by
nodes did not accept incoming client connections until
their promotion. PostgreSQL allows a so-called
“streaming replication”, also known as “Hot
Standby”: in this scenario the stand-by nodes accept
incoming client read transactions (e.g., SELECT) but do
not allow a client to modify the data. In other words,
the stand-by node can be used as a more active node in
the network architecture, for instance for load balancing
purposes.

Configuration for Hot Standby is very similar to
that of log streaming replication: the stand-by node
asks the master node for the WALs, and the master is in
turn informed that it works in a Hot Standby configuration.
Moreover, the stand-by node also knows that it works in
a Hot Standby configuration, and therefore accepts
client connections and executes client statements in
read-only transactions (see Figure 3). The following are
the parameters required in the master postgresql.conf
configuration file:


wal_level='hot_standby'
archive_mode=on
archive_command='test 1 = 1'
archive_timeout=30
max_wal_senders=1
replication_timeout=60
wal_keep_segments=16
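Parameters like the ones listed above can be applied to an existing postgresql.conf with a small helper. Listing 4 relies on the BSD form of sed -i; the sketch below is an alternative that writes to a temporary file instead. The name set_conf_param is hypothetical, and the helper assumes that values contain no "|", "&" or backslash characters.

```sh
#!/bin/sh
# Set name=value in a PostgreSQL-style configuration file: an existing
# line (commented out or not) is rewritten, a missing one is appended.
# Hypothetical helper, not part of the article's scripts.
set_conf_param() {
    conf_file=$1
    name=$2
    value=$3
    tmp_file="${conf_file}.tmp.$$"
    if grep -q "^#*[[:space:]]*${name}[[:space:]]*=" "$conf_file"
    then
        # rewrite the whole line, dropping any leading comment marker
        sed "s|^#*[[:space:]]*${name}[[:space:]]*=.*|${name}=${value}|" \
            "$conf_file" > "$tmp_file" && mv "$tmp_file" "$conf_file"
    else
        echo "${name}=${value}" >> "$conf_file"
    fi
}

# Example calls for the master node:
#   set_conf_param /postgresql/cluster1/postgresql.conf wal_level "'hot_standby'"
#   set_conf_param /postgresql/cluster1/postgresql.conf max_wal_senders 1
```

The pattern is anchored at the beginning of the line, so a parameter name that merely appears inside a trailing comment is not touched. As with the article's script, the master has to be restarted before changes to parameters such as wal_level are picked up.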


Similarly to the case of log-streaming replication, the
stand-by node will have a recovery.conf file that contains
the working mode (Hot Standby) and where the master is
(i.e., how to fetch WAL segments):


standby_mode='on'
primary_conninfo=' host=192.168.200.2 user=replicator'
trigger_file='/postgresql/standby.4.trigger'


Moreover, the stand-by node needs a special option in its
postgresql.conf configuration file that tells the cluster
to accept read-only connections:


hot_standby=on


It is interesting to see how the replication is proceeding:
the script 00-workload.sh shows the replication information
if called with the show parameter, as done in Listing 3;
a sample of its output is reported after Listing 4.


Figure 3. Scenario for Hot Standby replication: the standby
node asks the master directly for the WALs using the
replication connection (and its user), then replays the WALs
to stay in sync with the master, while accepting client
read-only connections.
Listing 4. The script to create a replica configuration from scratch


#!/bin/sh

STANDBY_CLUSTER_NUMBER=$1
REPLICATION_MODE=$2
REPLICATION_SYNC=$3
POSTGRESQL_ROOT=/postgresql
WAL_ARCHIVES=${POSTGRESQL_ROOT}/pitr
MASTER_CLUSTER=${POSTGRESQL_ROOT}/cluster1
DEST_CLUSTER=${POSTGRESQL_ROOT}/cluster${STANDBY_CLUSTER_NUMBER}
RECOVERY_FILE=$DEST_CLUSTER/recovery.conf

HOST_IP=192.168.200.2
HOST_NET=192.168.200.0/24
REPLICATION_USER="replicator"
POSTGRESQL_CONF_TEMPLATE=/postgresql/postgresql.conf.template

REPLICATION_MODE_LOGSHIPPING="logshipping"
REPLICATION_MODE_LOGSTREAMING="logstreaming"
REPLICATION_MODE_HOTSTANDBY="hotstandby"
REPLICATION_MODE_SYNC="sync"
REPLICATION_MODE_ASYNC="async"

# A function to set up the file system for the standby node
# and to start the backup of the master cluster.
clone_master(){
    # stop the cluster if running!
    /usr/local/bin/pg_ctl -D $DEST_CLUSTER stop > /dev/null 2>&1
    sleep 2
    rm -rf $DEST_CLUSTER
    mkdir $DEST_CLUSTER
    chown pgsql:pgsql $DEST_CLUSTER

    echo "Starting physical backup of the master node [$BACKUP_LABEL]"
    psql -U pgsql -c "SELECT pg_start_backup('$BACKUP_LABEL');" template1
    cp -R ${MASTER_CLUSTER}/* $DEST_CLUSTER
    rm -rf $DEST_CLUSTER/pg_xlog/* $DEST_CLUSTER/*.pid $DEST_CLUSTER/recovery.* $DEST_CLUSTER/*label*
    psql -U pgsql -c "SELECT pg_stop_backup();" template1
    echo "Physical backup of the master node [$BACKUP_LABEL] finished"
}

# Creates the recovery.conf file for the standby node in the case
# of the log shipping.
create_recovery_file_for_log_shipping(){
    rm $TRIGGER_FILE > /dev/null 2>&1
    echo "standby_mode='on'" > $RECOVERY_FILE
    echo "restore_command='cp $WAL_ARCHIVES/%f %p'" >> $RECOVERY_FILE
    echo "archive_cleanup_command='pg_archivecleanup $WAL_ARCHIVES %r'" >> $RECOVERY_FILE
    echo "trigger_file='$TRIGGER_FILE'" >> $RECOVERY_FILE
}

# Creates a replication user on the master node to allow the standby
# to connect using such user to retrieve log WALs.
create_replication_user_on_master_if_not_exists(){
    # check if the replication user exists and has the replication capabilities
    REPLICATION_USER_EXISTS=`psql -U pgsql -A -t -c "SELECT rolreplication FROM pg_roles WHERE rolname = '$REPLICATION_USER';" template1`

    if [ -z "$REPLICATION_USER_EXISTS" ]
    then
        echo "Creating the new replication user $REPLICATION_USER"
        psql -U pgsql -c "CREATE USER $REPLICATION_USER WITH REPLICATION LOGIN CONNECTION LIMIT 1 PASSWORD '$REPLICATION_USER';" template1
    else
        if [ $REPLICATION_USER_EXISTS = "t" ]
        then
            echo "Replication user $REPLICATION_USER already exists on the master cluster!"
        else
            if [ $REPLICATION_USER_EXISTS = "f" ]
            then
                echo "Replication user $REPLICATION_USER already exists but does not have replication capabilities! Altering the role..."
                psql -U pgsql -c "ALTER ROLE $REPLICATION_USER WITH REPLICATION;" template1
            fi
        fi
    fi
}

# Adds an entry in the pg_hba.conf file to allow the connection of the slave
# for the replication.
add_entry_pghba_if_not_exists(){
    grep $REPLICATION_USER ${MASTER_CLUSTER}/pg_hba.conf > /dev/null 2>&1
    if [ $? -ne 0 ]
    then
        echo "adding an entry in the master pg_hba.conf"
        echo "host replication $REPLICATION_USER $HOST_NET trust" >> ${MASTER_CLUSTER}/pg_hba.conf
    fi
}

# A function to check that the configuration of the master node is
# appropriate for the log shipping replication.
adjust_master_configuration_for_log_shipping(){
    local CONF=${MASTER_CLUSTER}/postgresql.conf
    cp $CONF /postgresql/postgresql.conf.beforeLogShipping.$$ > /dev/null 2>&1

    # wal_level = 'archive'
    sed -i .bak "s/wal_level[ \t]*=.*/wal_level='archive'/g" $CONF
    # archive_mode = on
    sed -i .bak "s/archive_mode[ \t]*=.*/archive_mode=on/g" $CONF
    # archive command => copy to pitr directory
    sed -i .bak "s,archive_command[ \t]*=.*,archive_command='cp -i %p /postgresql/pitr/%f',g" $CONF
    # force a log segment every 30 seconds max
    sed -i .bak "s/archive_timeout[ \t]*=.*/archive_timeout=30/g" $CONF
}

# A function to check that the configuration of the master node is
# appropriate for the log streaming replication.
adjust_master_configuration_for_log_streaming(){
    local CONF=${MASTER_CLUSTER}/postgresql.conf
    cp $CONF /postgresql/postgresql.conf.beforeLogStreaming.$$ > /dev/null 2>&1

    # wal_level = 'archive'
    sed -i .bak "s/wal_level[ \t]*=.*/wal_level='archive'/g" $CONF
    # archive_mode = on
    sed -i .bak "s/#*archive_mode[ \t]*=.*/archive_mode=on/g" $CONF
    sed -i .bak "s,#*archive_command[ \t]*=.*,archive_command='test 1 = 1',g" $CONF
    # force a log segment every 30 seconds max
    sed -i .bak "s/#*archive_timeout[ \t]*=.*/archive_timeout=30/g" $CONF
    # ensure at least one wal sender
    sed -i .bak "s/#*max_wal_senders[ \t]*=.*/max_wal_senders=1/g" $CONF
    # log connections, so we can see who is connecting to the master node
    sed -i .bak "s/#*log_connections[ \t]*=.*/log_connections=on/g" $CONF
    sed -i .bak "s/#*wal_keep_segments[ \t]*=.*/wal_keep_segments=16/g" $CONF
}

# A function to check that the configuration of the master node is
# appropriate for the hotstandby replication.
adjust_master_configuration_for_hotstandby(){
    local CONF=${MASTER_CLUSTER}/postgresql.conf
    cp $CONF /postgresql/postgresql.conf.beforeHotStandby.$$ > /dev/null 2>&1

    # wal_level = 'hot_standby'
    sed -i .bak "s/wal_level[ \t]*=.*/wal_level='hot_standby'/g" $CONF
    # archive_mode = on
    sed -i .bak "s/#*archive_mode[ \t]*=.*/archive_mode=on/g" $CONF
    sed -i .bak "s,#*archive_command[ \t]*=.*,archive_command='test 1 = 1',g" $CONF
    # force a log segment every 30 seconds max
    sed -i .bak "s/#*archive_timeout[ \t]*=.*/archive_timeout=30/g" $CONF
    # ensure at least one wal sender
    sed -i .bak "s/#*max_wal_senders[ \t]*=.*/max_wal_senders=1/g" $CONF
    # log connections, so we can see who is connecting to the master node
    sed -i .bak "s/#*log_connections[ \t]*=.*/log_connections=on/g" $CONF
    # number of WAL segments kept for the standby nodes
    sed -i .bak "s/#*wal_keep_segments[ \t]*=.*/wal_keep_segments=16/g" $CONF
    # terminate the connections if the standby has crashed
    sed -i .bak "s/#*replication_timeout[ \t]*=.*/replication_timeout=60/g" $CONF
}

# Activate the master configuration for sync replication.
adjust_master_sync_replication(){
    local CONF=${MASTER_CLUSTER}/postgresql.conf
    SYNC_APP_NAME="sync_replication_$STANDBY_CLUSTER_NUMBER"
    echo "Synchronous application name: $SYNC_APP_NAME"
    sed -i .bak "s/#*synchronous_standby_names[ \t]*=.*/synchronous_standby_names=$SYNC_APP_NAME/g" $CONF
    sed -i .bak "s/#*synchronous_commit[ \t]*=.*/synchronous_commit=on/g" $CONF
    sed -i .bak "s/#*replication_timeout[ \t]*=.*/replication_timeout=1000/g" $CONF
}

# Restart the master cluster.
restart_master_cluster(){
    echo "Restarting the master cluster"
    /usr/local/bin/pg_ctl -D $MASTER_CLUSTER restart
}

# Creates the recovery.conf file for the standby node in the case
# of the log streaming.
create_recovery_file_for_log_streaming(){
    rm $TRIGGER_FILE > /dev/null 2>&1
    echo "standby_mode='on'" > $RECOVERY_FILE
    echo "primary_conninfo=' host=$HOST_IP user=$REPLICATION_USER'" >> $RECOVERY_FILE
    echo "trigger_file='$TRIGGER_FILE'" >> $RECOVERY_FILE
}

# Creates the recovery.conf file for the standby node in the case
# of the synchronous replication.
create_recovery_file_for_log_streaming_sync_replication(){
    rm $TRIGGER_FILE > /dev/null 2>&1
    echo "standby_mode='on'" > $RECOVERY_FILE
    echo "primary_conninfo=' host=$HOST_IP user=$REPLICATION_USER application_name=$SYNC_APP_NAME '" >> $RECOVERY_FILE
    echo "trigger_file='$TRIGGER_FILE'" >> $RECOVERY_FILE
}

# Adjusts the postgresql.conf file of the standby node.
adjust_standby_configuration(){
    local CONF=$DEST_CLUSTER/postgresql.conf
    sed -i .bak "s/#*port[ \t]*=.*/port=$DEST_PORT/g" $CONF
    sed -i .bak "s/wal_level[ \t]*=.*/wal_level='minimal'/g" $CONF
    sed -i .bak "s/archive_mode[ \t]*=.*/archive_mode='off'/g" $CONF
    sed -i .bak "s/max_wal_senders[ \t]*=.*/#max_wal_senders=0/g" $CONF
    sed -i .bak "s/#*log_connections[ \t]*=.*/log_connections=off/g" $CONF
}

# A function to activate the hot standby for the standby node.
activate_hot_standby_on_standby_node(){
    local CONF=$DEST_CLUSTER/postgresql.conf
    sed -i .bak "s/#*hot_standby[ \t]*=.*/hot_standby=on/g" $CONF
}

activate_hot_standby_sync_on_standby_node(){
    local CONF=$DEST_CLUSTER/postgresql.conf
    sed -i .bak "s/#*hot_standby[ \t]*=.*/hot_standby=on/g" $CONF
    sed -i .bak "s/#*wal_receiver_status_interval[ \t]*=.*/wal_receiver_status_interval=500/g" $CONF
}

# Print some final instructions for the usage of the standby.
print_final_info(){
    echo "Standby node $STANDBY_CLUSTER_NUMBER will listen on port $DEST_PORT"
    echo "Execute the following command to change the status on the standby"
    echo "in order to accept incoming connections:"
    echo
    echo "    touch $TRIGGER_FILE"
    echo "To manage the cluster use:"
    echo
    echo "    /usr/local/bin/pg_ctl -D $DEST_CLUSTER { start | stop }"
    echo
    echo "To run a workload please execute"
    echo
    echo "    sh 00-workload.sh $STANDBY_CLUSTER_NUMBER [activate | show]"
}

# check the number of arguments
if [ $# -lt 2 ]
then
    echo "Usage:"
    echo "$0 <standby-number> <$REPLICATION_MODE_LOGSHIPPING | $REPLICATION_MODE_LOGSTREAMING | $REPLICATION_MODE_HOTSTANDBY> [start]"
    exit
fi

# check to operate on a cluster different from the master one
if [ $STANDBY_CLUSTER_NUMBER -eq 1 ]
then
    echo "Cluster #1 is the master!"
    exit
fi

# compute on which TCP/IP port the standby will be accepting connections
DEST_PORT=`psql -U bsdmag -A -t -c "SELECT setting FROM pg_settings WHERE name = 'port';" template1`
DEST_PORT=`expr $DEST_PORT + $STANDBY_CLUSTER_NUMBER`
echo "Destination port $DEST_PORT"

# where will be the recovery file for this standby node?
RECOVERY_FILE=$DEST_CLUSTER/recovery.conf
# which trigger file to use to activate the node?
TRIGGER_FILE=/postgresql/standby.${STANDBY_CLUSTER_NUMBER}.trigger

# which kind of replication should I do?
case $REPLICATION_MODE in
    ${REPLICATION_MODE_LOGSHIPPING})
        echo "Log shipping replication"
        BACKUP_LABEL=${REPLICATION_MODE_LOGSHIPPING}
        # 0) ensure the master has the right configuration for log shipping
        adjust_master_configuration_for_log_shipping
        restart_master_cluster
        # 1) clone the master into the standby directory filesystem
        clone_master
        # 2) create the recovery.conf file
        create_recovery_file_for_log_shipping
        # 3) adjust the postgresql.conf file in the standby node
        adjust_standby_configuration
        ;;

    ${REPLICATION_MODE_LOGSTREAMING})
        echo "Log streaming replication"
        BACKUP_LABEL=${REPLICATION_MODE_LOGSTREAMING}
        # 0) ensure the master node has the right configuration
        adjust_master_configuration_for_log_streaming
        restart_master_cluster
        # 1) clone the master into the standby directory filesystem
        clone_master
        # 2) create the recovery.conf file
        create_recovery_file_for_log_streaming
        # 3) create the replication user on the master node
        create_replication_user_on_master_if_not_exists
        # 4) allow the standby node to connect back to the master to allow for replication
        add_entry_pghba_if_not_exists
        # 5) set the standby configuration to not ship logs
        adjust_standby_configuration
        ;;

    ${REPLICATION_MODE_HOTSTANDBY})
        echo "Hot Standby replication"
        BACKUP_LABEL=${REPLICATION_MODE_HOTSTANDBY}
        # 0) ensure the master node has the right configuration
        adjust_master_configuration_for_hotstandby
        restart_master_cluster
        # 1) clone the master into the standby directory filesystem
        clone_master
        # 2) adjust parameters on master and create recovery file on stand-by
        if [ "$REPLICATION_SYNC" = "$REPLICATION_MODE_SYNC" ]
        then
            echo "Synchronous replication"
            adjust_master_sync_replication
            create_recovery_file_for_log_streaming_sync_replication
        else
            create_recovery_file_for_log_streaming
        fi
        # 3) create the replication user on the master node
        create_replication_user_on_master_if_not_exists
        # 4) allow the standby node to connect back to the master to allow for replication
        add_entry_pghba_if_not_exists
        # 5) set the standby configuration to not ship logs
        adjust_standby_configuration
        activate_hot_standby_on_standby_node
        if [ "$REPLICATION_SYNC" = "$REPLICATION_MODE_SYNC" ]
        then
            restart_master_cluster
        fi
        ;;

    *)
        echo "Cannot proceed without the replication method"
        exit
        ;;
esac

print_final_info

# should I start the standby node?
if [ "$3" = "start" ]
then
    echo "Starting the standby node $DEST_CLUSTER"
    /usr/local/bin/pg_ctl -D $DEST_CLUSTER start
fi
Showing replica information
====== MASTER ======
1116 postgres: archiver process
====== STANDBY ======
1150 postgres: startup process
xlog location: master C9AF5F8 (9999598)
xlog location: standby C9AF5F8 (9999598)
Difference is 0


On the master node there is an “archiver” process
that is in charge of keeping the WALs and serving
them to the stand-by nodes; on the stand-by, on the other
hand, there is a “startup” process that is replaying the
WALs. It is possible to see the last WAL location written
by the master using the pg_current_xlog_location()
function on the master node and the
pg_last_xlog_replay_location() function on the stand-by.
As shown in the above output, the difference between the
two locations is zero, meaning that the stand-by is aligned
with the master node. The above script keeps printing the
synchronization information until the difference between
the WAL locations is zero.
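The comparison of the two xlog locations can also be done with shell arithmetic only. The workload script exports ibase and obase before calling bc(1), but bc takes those settings from its own input rather than from the environment, which is probably why the output above reports 9999598 for C9AF5F8. The sketch below converts the hexadecimal offset directly; the function names are hypothetical, and the difference is meaningful only under the assumption that both locations share the same log id (the part before the slash).

```sh
#!/bin/sh
# Convert the offset part of an xlog location such as 0/C9AF5F8 into a
# decimal number of bytes (hypothetical helper).
xlog_offset_to_dec() {
    offset=${1#*/}          # drop everything up to the slash, if any
    echo $(( 0x$offset ))   # hexadecimal constant in shell arithmetic
}

# Difference in bytes between the master and the stand-by location,
# assuming both locations have the same log id.
xlog_diff() {
    master_dec=$(xlog_offset_to_dec "$1")
    standby_dec=$(xlog_offset_to_dec "$2")
    echo $(( $master_dec - $standby_dec ))
}

# Example calls:
#   xlog_offset_to_dec 0/C9AF5F8     prints 211482104
#   xlog_diff 0/C9AF600 0/C9AF5F8    prints 8
```

A difference of zero still means that the stand-by has replayed everything the master has written, exactly as in the output shown above.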

Actions for performing a Hot Standby replication are
summarized as follows:

• configure the master node to archive the WALs for
Hot Standby;

• perform a physical backup of the master node in
order to create the standby cluster;

• configure the standby cluster to not archive WALs
and to act as a Hot Standby node;

• create a recovery.conf file that instructs the
standby node to act as a standby and to ask the
master for the WALs;

• when required, generate the triggering file to let the
standby node detach from the master and start
accepting user connections.

Again, the configuration is almost the same as in the
previous replication scenarios.

Listing 5. A workload script

#!/bin/sh

STANDBY_CLUSTER_NUMBER=$1
STANDBY_OPERATION=$2
POSTGRESQL_ROOT=/postgresql
WAL_ARCHIVES=${POSTGRESQL_ROOT}/pitr
MASTER_CLUSTER=${POSTGRESQL_ROOT}/cluster1
DEST_CLUSTER=${POSTGRESQL_ROOT}/cluster${STANDBY_CLUSTER_NUMBER}
RECOVERY_FILE=$DEST_CLUSTER/recovery.conf
STANDBY_OPERATION_ACTIVATE="promote"
STANDBY_OPERATION_SHOWREPLICATION="show"

if [ $# -le 0 ]
then
    echo "Please specify the number of the cluster to configure!"
    echo
    echo "Usage: $0 <cluster number> [ $STANDBY_OPERATION_ACTIVATE | $STANDBY_OPERATION_SHOWREPLICATION ]"
    exit
fi

if [ $# -eq 1 ]
then
    STANDBY_OPERATION=$STANDBY_OPERATION_ACTIVATE
fi

if [ $STANDBY_CLUSTER_NUMBER -eq 1 ]
then
    echo "Cluster #1 is the master!"
    exit
fi

DEST_PORT=`psql -U bsdmag -A -t -c "SELECT setting FROM pg_settings WHERE name = 'port';" template1`
DEST_PORT=`expr $DEST_PORT + $STANDBY_CLUSTER_NUMBER`

echo "Operating mode is $STANDBY_OPERATION"
echo "The stand-by node $STANDBY_CLUSTER_NUMBER is listening on $DEST_PORT"

TEST_TABLE_NAME="test$$"
COUNT_QUERY_MAGAZINE="SELECT count(*) FROM magazine;"
COUNT_QUERY_TEST="SELECT count(*) FROM $TEST_TABLE_NAME;"

echo "Inserting tuples into the master node"
psql -U bsdmag -c "TRUNCATE TABLE magazine;" bsdmagdb
psql -U bsdmag -c "INSERT INTO magazine(id, title) VALUES( generate_series(1, 1000000), 'TEST-REPLICA');" bsdmagdb
echo "Tuples in the master node (magazine table)"
psql -U bsdmag -A -t -c "$COUNT_QUERY_MAGAZINE" bsdmagdb
psql -U bsdmag -c "CREATE TABLE $TEST_TABLE_NAME(pk serial NOT NULL, description text);" bsdmagdb
psql -U bsdmag -c "INSERT INTO $TEST_TABLE_NAME(pk, description) VALUES( generate_series(1, 500000), 'NEW-TABLE-TEST');" bsdmagdb

if [ "$STANDBY_OPERATION" = "$STANDBY_OPERATION_ACTIVATE" ]
then
    echo "Activating the stand-by node..."
    sleep 30
    touch /postgresql/standby.${STANDBY_CLUSTER_NUMBER}.trigger
    sleep 10

    echo "=========================================="
    echo "Tuples in the master node (magazine table)"
    psql -U bsdmag -A -t -c "$COUNT_QUERY_MAGAZINE" bsdmagdb
    echo "Tuples in the master node (test table)"
    psql -U bsdmag -A -t -c "$COUNT_QUERY_TEST" bsdmagdb
    echo "=========================================="
    echo "Tuples in the slave node (magazine table)"
    psql -U bsdmag -A -t -p $DEST_PORT -c "$COUNT_QUERY_MAGAZINE" bsdmagdb
    echo "Tuples in the slave node (test table)"
    psql -U bsdmag -A -t -p $DEST_PORT -c "$COUNT_QUERY_TEST" bsdmagdb
    echo "=========================================="
else
    if [ "$STANDBY_OPERATION" = "$STANDBY_OPERATION_SHOWREPLICATION" ]
    then
        WAL_DIFFERENCE=1
        while [ $WAL_DIFFERENCE -gt 0 ]
        do
            echo "Showing replica information"
            # show log shipping processes
            echo "====== MASTER ======"
            pgrep -f -l -i postgres | grep archiver
            pgrep -f -l -i postgres | grep sender
            echo "====== STANDBY ======"
            pgrep -f -l -i postgres | grep startup
            pgrep -f -l -i postgres | grep receiver
            echo "====================="

            # keep only the part of the location after the slash
            MASTER_XLOG_LOCATION=`psql -U pgsql -A -t -c "SELECT pg_current_xlog_location();" template1 | sed 's/.*\///'`
            STANDBY_XLOG_LOCATION=`psql -U pgsql -A -t -p $DEST_PORT -c "SELECT pg_last_xlog_replay_location();" template1 | sed 's/.*\///'`

            obase=10
            ibase=16
            export obase
            export ibase
            MASTER_XLOG_LOCATION_10=`echo $MASTER_XLOG_LOCATION | bc`
            STANDBY_XLOG_LOCATION_10=`echo $STANDBY_XLOG_LOCATION | bc`
            echo "xlog location: master $MASTER_XLOG_LOCATION ($MASTER_XLOG_LOCATION_10)"
            echo "xlog location: standby $STANDBY_XLOG_LOCATION ($STANDBY_XLOG_LOCATION_10)"
            WAL_DIFFERENCE=`expr $MASTER_XLOG_LOCATION_10 - $STANDBY_XLOG_LOCATION_10`
            echo "Difference is $WAL_DIFFERENCE"
        done
    fi
fi


The Hot Standby described above is also called
asynchronous replication, since the master node does not
keep track of the stand-by status directly. In other words,
once a transaction on the master is committed, the master
immediately makes the transaction persistent, even if the
stand-by nodes have not yet requested or replayed such
transaction. What could happen now is that the connection
between the master and the stand-by nodes goes down,
making it impossible for the stand-bys to fully replicate
the master status. To avoid this problem, starting from the
9.1 release PostgreSQL also supports synchronous
replication: each time a transaction is committed on
the master, the master waits for an acknowledgement from
the stand-by nodes indicating that they have also received
such transaction. As readers can imagine, a master
node waiting indefinitely for a stand-by acknowledgement
can quickly become a failure itself (e.g., if the stand-by
nodes are not able to send back their acknowledgement),
and therefore the master node will wait only for a specified
amount of time, after which it will apply the transaction
commit anyway.

Synchronous replication is configured through a
list of stand-by names that tells the master which nodes
are replicating it synchronously. Only one of multiple
stand-by nodes will be asked by the master to acknowledge
commits; the other stand-bys will take over as synchronous
stand-by nodes once the former stops responding to the
master. In the master postgresql.conf file the following
parameters must be configured:


synchronous_standby_names=sync_replication_6
replication_timeout=1000
synchronous_commit=on


In such a case the stand-by node will be named
sync_replication_6. The replication_timeout parameter allows
the master to discard a replication connection that has
been inactive for the specified number of milliseconds. In
the recovery.conf of each stand-by such name must be
repeated:


standby_mode='on'
primary_conninfo=' host=192.168.200.2 user=replicator
application_name=sync_replication_6 '
trigger_file='/postgresql/standby.6.trigger'
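Since the same name has to appear on both sides, a quick consistency check can catch typos before the stand-by is started. The following is a minimal sketch: standby_is_synchronous is a hypothetical helper, it assumes that each setting sits on a single line without trailing comments, and it does not handle the special value '*'.

```sh
#!/bin/sh
# Succeed when the application_name found in a stand-by recovery.conf
# is listed in synchronous_standby_names of the master postgresql.conf
# (hypothetical helper, shown for illustration only).
standby_is_synchronous() {
    master_conf=$1
    recovery_conf=$2
    # name announced by the stand-by in primary_conninfo
    app_name=$(sed -n "s/.*application_name=\([^ ']*\).*/\1/p" "$recovery_conf")
    # comma separated list configured on the master, quotes removed
    sync_names=$(grep '^synchronous_standby_names' "$master_conf" |
        cut -d= -f2 | tr -d " '")
    [ -n "$app_name" ] || return 1
    case ",${sync_names}," in
        *",${app_name},"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call:
#   standby_is_synchronous /postgresql/cluster1/postgresql.conf \
#       /postgresql/cluster6/recovery.conf && echo "names match"
```

Wrapping both the list and the name in commas makes the case pattern match whole names only, so sync_replication_6 is not mistaken for sync_replication_60.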


Finally, in the stand-by configuration it is appropriate to 
place a configuration parameter like the following: 


wal_receiver_status_interval = 500


in order to send back to the master information about the
replication status of the stand-by. This is useful to prevent
the master from dropping the replication connection too
early in the case of a network delay.

The above configuration can again be obtained using
the script of Listing 4, passing sync as the last argument
as shown below:


~> sh 01-createAndConfigureStandby.sh 5 hotstandby sync
~> /usr/local/bin/pg_ctl -D /postgresql/cluster5 start


On The Web

• PostgreSQL official Web Site: http://www.postgresql.org
• ITPUG official Web Site: http://www.itpug.org
• PostgreSQL Replication Documentation: http://www.postgresql.org/docs/9.1/interactive/high-availability.html
• PostgreSQL Streaming Replication Wiki: http://wiki.postgresql.org/wiki/Streaming_Replication
• Replication solutions comparison: http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
• Londiste Tutorial: http://wiki.postgresql.org/wiki/Londiste_Tutorial





If the master node cannot be sure that a transaction
has been replicated on the stand-by, a message similar to
the following one will be reported in the logs:


The transaction has already committed locally, but might
not have been replicated to the standby.


This means that there is no absolute guarantee that the
stand-by is still replicating the master node.


General Considerations
Log streaming replication and Hot Standby are very
similar from an operational point of view, so some
considerations about their configuration apply to both.
The first one concerns the WAL archiving performed by
the master node: the WAL segments must be kept long
enough to allow the stand-by to request them again after
a network failure. Therefore the parameter
wal_keep_segments must be set accordingly, also taking
into account the available disk space and the fact that
each segment occupies 16 MB. A second consideration
concerns the max_wal_senders parameter of the master node,
which must be at least the number of stand-by nodes that
are supposed to be replicating the data. This parameter
sets the number of processes that will serve WAL requests
from the stand-by node(s).
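
The two sizing rules above can be sketched as a small shell helper. This is only an illustration: the function names wal_retention_mb and enough_wal_senders and the sample values are assumptions made here, not part of PostgreSQL or of the scripts used in this article. The 16 MB figure is the segment size mentioned above.

```sh
#!/bin/sh
# Hypothetical sizing helpers, not part of PostgreSQL.
WAL_SEGMENT_MB=16   # size of one WAL segment, as stated in the text

# Print the disk space (in MB) retained by a given wal_keep_segments value.
wal_retention_mb() {
    echo $(( $1 * WAL_SEGMENT_MB ))
}

# Succeed only if max_wal_senders ($1) covers the number of stand-by nodes ($2).
enough_wal_senders() {
    [ "$1" -ge "$2" ]
}

# Example with assumed values: 64 segments kept, 2 senders for 2 stand-by nodes.
echo "WAL retention: $(wal_retention_mb 64) MB"
if enough_wal_senders 2 2; then
    echo "max_wal_senders is sufficient"
else
    echo "max_wal_senders is too low"
fi
```

With these assumed values the master would hold about 1 GB of WAL segments, which has to fit on the disk that hosts the pg_xlog directory.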

If something goes wrong between the master and the
stand-by node, for instance the master stops responding
due to a network problem, the stand-by will report it in
the logs:

FATAL: replication terminated by primary server
FATAL: could not connect to the primary server: could not connect to server: Connection refused


If the master node becomes available again, the stand-by
node resumes receiving WALs and therefore catches up
again. Nevertheless, it is worth stressing how important
it is to monitor the status of the replicating clusters to
ensure that everything is working fine.
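
One possible way to monitor the replication status is to query the pg_stat_replication view on the master, which lists one row per connected WAL sender. The sketch below rests on several assumptions: the function names, the postgres user and the idea of counting the stand-by nodes in the streaming state are illustrative choices, not something the scripts of this article provide.

```sh
#!/bin/sh
# Hypothetical monitoring helpers; names and connection settings are assumptions.

# Ask the master for one "application_name|state|sync_state" line per stand-by.
# -A = unaligned output, -t = tuples only; the default field separator is '|'.
replication_status() {
    psql -U postgres -At \
         -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"
}

# Read such lines on stdin and print how many stand-by nodes are streaming.
count_streaming() {
    awk -F'|' '$2 == "streaming" { n++ } END { print n + 0 }'
}
```

On the master, running `replication_status | count_streaming` prints the number of stand-by nodes currently streaming. A value lower than expected suggests that a stand-by has dropped off and that the logs should be checked.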

Please take into account that in streaming replication,
either synchronous or asynchronous, the master node
will not honour a normal shutdown request until all the
WAL segments have been served to the attached stand-by
nodes.

As already stated, transactions executed on hot stand-by
nodes must be read-only. In particular, no transaction id
(xid) will be assigned to transactions started on the
stand-by nodes, and such transactions will be unable to
write data to the “local” WAL.


Summary and Coming Next 

Thanks to the replication capabilities of PostgreSQL,
setting up a mirrored environment is straightforward
and allows DBAs to provide an environment for High
Availability. In the next article we step back to SQL to see
some of the features available to DBAs and developers.

Interview with Mark Price,
President of Tranquil Hosting, owner of RootBSD





Technology entrepreneur with proven experience working with
and integrating a wide range of systems and networks. His
specialties are server management, advanced networking,
web hosting, Linux, BSD, and open source software.


INTRO


Can you tell our readers how you came to be where
you are and to do what you do?
MP: As a kid, I loved to play with computers, as most
readers of BSDmag can probably relate to. I first explored
both Linux (Slackware) and BSD by installing FreeBSD.
Back then, you had to really want to try a new OS since
it required ordering a CD-ROM or spending your nights
downloading files from the Walnut Creek FTP site on a slow
dialup connection. As the growth of the web exploded,
I've been putting my interest in Unix to work by building
hosting services for our growing customer base.


How did RootBSD come to be created?
MP: As you know, most mass-market web hosting
services are built on Linux. BSD users seem to be a more
‘do it yourself’ type who don’t necessarily want to just
run some automated installers and be like everyone else.

Due to our interest in BSD we were just curiously searching 
for BSD hosts to see how BSD hosts were addressing the 
technical challenges of offering Virtual Private Servers 
(VPS), since a lot of the mainstream technology was made 
specifically for Linux. We were surprised to find that there 
weren't any popular BSD VPS options out there, so we 
decided to create one. It was mainly just a self-serving 
project, as we didn’t think many BSD users would want to 
sign up for such a service anyway. 


Why clouds? Why hosting solutions? Is there a greater
idea behind it, or was it just a coincidence?
MP: Cloud hosting solutions are the most affordable
for consumers. The cost of a comparable dedicated 
server is typically much greater than that of a virtualized 
server. 

Private Clouds: The amount of resources required by 
modern services can easily exceed that of a single larger 
server. Clouds allow for services to scale up past the 
limitations of... 


What kind of technical difficulties and
problems did you have to go through?

MP: Specifically with developing our own BSD hosting 
service, we've had to deal with lots of little technical 
problems. At first it was with tuning and tweaking the 
jail code in FreeBSD. Later on it was with doing some 
complex networking configurations in Xen. And of course, 
we've also been through many different iterations of ways 
to deploy FreeBSD installations with automated tools 
while still addressing customers’ specific needs. These
technical challenges are things that I, and the rest of the 
RootBSD staff, really enjoy. 


DEFINITION OF CLOUD 

What is cloud that we didn’t have before? 

MP: I don’t think cloud is anything specifically new. I think
it's just a term that describes the trend in businesses
outsourcing more IT functions. My experience with IT
people in general is that IT people are very possessive
and territorial, wanting to have lots of servers doing lots
of things in-house. The ‘cloud’ idea just says “OK, we are
going to outsource some of this boring technology stuff
and instead concentrate on what's really important to us”.


Is cloud in BSD something different than in 
other systems? 

MP: I think that cloud means something slightly different
to each person, as well as to each developer and software 
architect. It is fascinating to see how BSD developers and 
users are building cloud solutions, and in many ways, the 
cloud is blind to specific operating systems. For example, 
Colin Percival has figured out how to run FreeBSD on 
Amazon's large EC2 cloud, even though Amazon doesn't 
offer FreeBSD support themselves. 


What are the advantages of cloud for BSD users? 
MP: Thinking of cloud can bring advantages to all BSD 
users and IT professionals. The days of spending days 
setting up and configuring a new server have been 
replaced by a myriad of rapid deployment technologies 
and all sorts of on-demand technology services available 
now that we couldn't imagine 15 years ago. 


IMPLEMENTATION 


Could you give us some samples and case 
studies where cloud has been and is being 
implemented using BSD? 

MP: Yes, there has been a lot of exciting development in 
BSD that ties in nicely to cloud concepts: 

Sun’s ZFS filesystem has been implemented in FreeBSD
and is now incredibly stable and robust. ZFS offers all sorts
of flexible ways to think about storage rather than being
constrained to just legacy hardware solutions such as RAID
cards which are designed for fixed storage configurations.
ZFS is useful for scaling storage needs and also replicating
storage.

Jails have been used for years on FreeBSD hosts
to segment applications into containers. The jail environment
can make it easy for developers and administrators to
“spin up” and “spin down” test environments without having
to do traditional OS installation, create filesystems, etc.

CARP is a great networking technology to allow multiple
hosts to share an IP, an excellent example of using multiple
hosts to increase overall availability. The CARP protocol
was developed by the OpenBSD team and is now available
in other BSD flavors.


Can you tell us step by step how to build 
affordable and stable clouds using BSD? 
MP: There is no such thing as a “one-size-fits-all” solution, but
I would recommend looking at the jail functionality in FreeBSD
to learn how to rapidly set up and take down jails. Taking this a
step further, one can use ZFS to provide fault-tolerant storage
for a cloud of FreeBSD jails. This is something that anyone
can build at home with just a few older PCs.
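
As a rough sketch of the idea above, the following script prints the commands that would clone a ZFS snapshot of a template jail, start a jail on the clone, and later tear it down. Everything specific here is an assumption made for illustration: the pool and dataset names (tank/jails/base@clean), the jail name and the IP address are hypothetical, and the commands are only echoed, not executed.

```sh
#!/bin/sh
# Dry-run sketch: prints commands instead of running them.
# Dataset names, jail name and address below are hypothetical.
TEMPLATE="tank/jails/base@clean"   # snapshot of a prepared jail filesystem
POOL_PATH="tank/jails"

# Print the commands that would bring up a jail named $1 with address $2.
spin_up() {
    echo "zfs clone $TEMPLATE $POOL_PATH/$1"
    echo "jail -c name=$1 path=/$POOL_PATH/$1 host.hostname=$1 ip4.addr=$2 command=/bin/sh /etc/rc"
}

# Print the commands that would stop the jail named $1 and discard its clone.
spin_down() {
    echo "jail -r $1"
    echo "zfs destroy $POOL_PATH/$1"
}

spin_up test1 192.168.0.10
spin_down test1
```

Because a ZFS clone shares unchanged blocks with its snapshot, each additional jail built this way initially costs very little disk space, which is what makes quick test environments practical on modest hardware.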


SECURITY 


Where are the weak points of security in the cloud?

MP: From a very broad perspective, one of the weak points
is just in first delineating where the security boundaries
are. If you make use of cloud providers for running virtual
servers and storage, what security steps is the provider
taking for you? What is the customer’s responsibility to
secure? I'd say this is the biggest weak point: the ease of
use of some cloud technologies makes it easy to not think
about security implications.


How does your company handle security when
dealing with the cloud?

MP: We're always looking for ways to improve on security. 
We provide our new VPS customers with some general tips 
on how to keep up with securing their systems. From time 
to time, we will scan our whole network for certain popular 
vulnerabilities if we suspect a particular issue may be affecting 
customers. Taking it a step further, we offer custom solutions 
such as private clouds for businesses where we maintain and 
manage their environment with dedicated firewalls. 


How do you verify or dismiss rumors of 0-day
exploits and such?
MP: Hopefully, the systems have been set up from the
beginning to not allow any unneeded access or possible
attack vectors. This is done by turning off unneeded 
daemons (fortunately FreeBSD is very good out of the box 
in this respect!) and closing off network ports with a firewall. 
When an exploit is uncovered, we think that the FreeBSD 
security team does a great job with evaluating the risk of 
potential vulnerabilities, and addressing them quickly. 

If you're not already on the FreeBSD security mailing
list, read more here: http://security.freebsd.org/. 


FUTURE 


What will be the future of cloud in BSD in your 
opinion? 

MP: Within the next few years the “cloud” will become 
ubiquitous across all platforms, including BSD. While there 
will likely always be a need for some internal or onsite 
deployment of resources, as time goes on we will see an 
even larger shift towards flexible hosted infrastructures. 
Components of the “cloud” such as virtualization, SaaS, 
and distributed storage are all quickly becoming the 
standard as opposed to the cutting edge. The benefits of 
cloud infrastructure far outweigh any potential pitfalls, so 
it's an inevitable conclusion.

The Greater Benefits of Open Source Software


When Patrycja contacted me about contributing an article
for the April 2012 issue, I didn’t have any material ready
for print. Yes, I have a few ideas but they remain
works-in-progress at the moment. Instead, I opted to write
on what I think are the benefits of Open Source software.
I did have difficulty finding statistics on BSD operating
systems, and used Apache as an example.


Unlike software from commercial manufacturers, Open Source software is
written and perfected by volunteers, who freely share the 
programming code that would otherwise be kept secret. 
Under the terms of the most popular Open Source 
licenses, anyone can redistribute the software without
paying fees to its author, and anyone can modify it if
they distribute the new version under the original terms:
Open Source and non-proprietary. In recent years, Open
Source products such as the BSD family of operating 
systems (and Linux) have emerged as significant competitors 
to proprietary software products like Microsoft Windows. 
The rise of Open Source software and the challenge 
it poses to proprietary software producers is just the 
latest example of a long-standing tension between open 
and closed systems in the configuration of the global 
information infrastructure. Because the configuration of the 
information infrastructure has a strong impact on both the 
global economy and world politics, the rise of Open Source 
software is something that is best seen in a broader light, 
giving consideration to implications beyond the software 
industry. This phenomenon of Open Source software
has quite a lot to teach us about some very general
things: about the political economy of the Internet era and,
just as important, about the social process by
which we are coming to understand it and the kind of policy
process that is going to grow up around it.

Let’s address Open Source as a market phenomenon,
stating some of the basic facts and seeking to clarify some 
misconceptions that have emerged in recent treatment of 
the issue. 

First, the Open Source software movement, per se, has barely
begun. The method of collaboration involved in producing 
Open Source software dates back to the 1950s and 1960s, 
but its manifestation in the market today, and the legal and 
policy issues that accompany it, are both new. 

Second, discussions of Open Source have 
combined elements of truth with a lot of hype and 
misinformation. Open Source does not represent a 
“revolution in capitalism” or anything of the sort, and 
with proclamations like these it is easy to lose sight of 
what really is unique about the movement. To make 
matters worse, Open Source is sometimes actively 
misrepresented by stakeholders who feel threatened by 
its rise. One widespread misconception that has resulted 
is that Open Source software is inherently free software 
— a commodity that must be given away without charge 
and cannot be leveraged into a successful business 
model. Open Source software is free in the sense of 
freedom to view and modify its source code — not in the 
sense of zero price. For-profit companies such as BSD 
Virtual Machines have found ways to make Open Source 
software available to people by adding value in the form 
of customer service and support. 

While proprietary software still dominates the market
for personal computers (PCs), Open Source products 
are used widely on the servers that power the Internet — a 
development with profound implications for the software 
industry and Internet economy. As of March 2012, 65.2% 
of Web servers were running Apache, while only 13.8% 
ran Microsoft (according to Netcraft). Market share for 
the Open Source Apache server software is substantially 
higher than for any competing product, and Apache’s share 
is the only one that has grown over the past three years. 
This is a big deal. The fact that BSD may not run on your 
PC is not important: The desktop is like the steering wheel 
to your car, not the engine. The engine is the Internet, and 
increasingly it is built around Open Source software. 

Beyond its importance for the software industry 
and Internet economy, Open Source software is an 
interesting social phenomenon with broader economic 
implications. Two questions are particularly puzzling for
social scientists. The first concerns the micro-motivations 
of individuals in the Open Source software community 
— why do programmers contribute when they obtain no 
direct remuneration? Open Source software presents 
a collective action problem, since there are no clear 
individual incentives for participation. Part of the answer
is cultural: sharing is a cultural norm in the software
community, and software engineers see themselves as 
artists who are proud to show others the creative genius 
of their programming work. 

There are real economic incentives for participating in 
Open Source software development. A reputation as a good
programmer has great value for employees of commercial
software firms, who may seek to change jobs in the future 
— yet the product of their paid work is only visible to a few 
other individuals at their current employer, and they receive 
no public credit for their efforts. In contrast, the credits file of 
a piece of Open Source software lists who contributed what 
to the final product, and potential employers can examine 
the source code to judge the contribution for themselves. 
I firmly believe writing Open Source code is probably the
most efficient way to establish a reputation as a great 
programmer. And a reputation as a great programmer is 
something that you can make money on in other settings. In 
fact, this reputation-enhancing incentive lends an element 
of self-selection to the Open Source process, resulting in a
better end product. The best programmers actively seek to
showcase their work, while mediocre programmers would
not want to expose their sub-par code to public scrutiny.

A second interesting question about the development of 
Open Source software concerns the macro-foundations of 
the movement: how does the group manage to coordinate
the efforts of as many as 7,000 contributors? First, the
development of Open Source software enjoys positive
network externalities: it benefits from the participation of
greater and greater numbers of individuals, even if they do
not directly contribute code to include in future versions.
Every Open Source “free rider” now becomes at minimum 
a beta tester, who can find bugs, identify a new feature, 
and otherwise contribute to the collective good. The more 
copied and widely used a piece of Open Source software, 
the more valuable it becomes. 

Second, Open Source programmers divide a piece of 
software into a number of small, self-contained modules, 
each of which can be developed and perfected without 
knowing precisely how other modules function. Many 
contributors can thus work on different aspects of the 
project in parallel, minimizing the number of colleagues 
with whom each programmer must coordinate. Finally, the 
Open Source movement may have the outside appearance 
of an anarchic community, but in reality it includes formal 
decision-making structures and a set of shared norms to 
coordinate efforts and govern interaction. 

While the Open Source method is now actively 
revolutionizing the software industry, the mechanics of 
Open Source production have potentially much broader 
implications for the economy as a whole. The key to 
seeing beyond the software industry is to think of Open 
Source as a production process, where the software is 
simply a side-effect. There are four observations as to 
what this process tells us about the broader economy: 

First, Open Source is yet another demonstration of 
how the Internet facilitates geographically widespread 
collaboration, reducing the costs of communication
across great distances. 

Second, the movement has shown the value of 
distributed innovation and offers a new way to think about 
the division of labor in other industries. 

Third, Open Source software offers new ideas as to 
how to combine communities and commerce. The open 
software community is a relatively unique example of 
an open, value-driven community where participation is 
directed towards the creation of a significant product that 
travels outside the community itself. The model which has 
succeeded here may have potential in other areas of the 
economy. 

Finally, Open Source software is going to be a 
significant factor in the international economy in the 21st 
century. The production of BSD operating systems has 
involved programmers from around the globe, many of 
them from developing countries that have security or 
economic reasons to avoid using proprietary software. I
think you can spin out an interesting story where Open 
Source software turns out to be a powerful instrument of 
development bootstrapping. Open Source software shifts 
the decision making prerogative into the hands of people 
in developing countries. 

In the past, Microsoft's reaction to the Open Source
software movement resembled the reaction
of national telecommunications carriers when the Internet
began to emerge and threaten their entrenched positions.
As in this earlier struggle, stakeholders have a lot of 
influence with policy makers, and they have the potential 
to leverage this influence into decisions that would be 
detrimental to the Open Source software movement. 
Microsoft poses a formidable challenge to the Open 
Source software movement. 

The company’s first move was to deny that Open Source 
posed any real threat to its interests, but it has since shifted 
to a two-pronged strategy: spreading fear, uncertainty,
and doubt about Open Source (the FUD factor); and
embracing and extending the concept to render it less 
threatening. This latter tactic is epitomized by Microsoft's 
shared source initiative, in which it proposes to share its 
code with other large software companies but not allow 
them to modify it or distribute it further. Through use of 
these tactics, it would be easy for Microsoft to sow a lot of 
confusion about Open Source and get the policy community 
on its side. If it succeeded in changing copyright law or 
using patents to prevent reverse engineering of software 
products, for instance, this could have dire consequences 
for the Open Source movement. 

Let's address security of Open Source versus
proprietary software. I have confidence that Open Source
software would be more secure than proprietary software
because of the greater number of programmers working 
to expose and correct security flaws. A security problem 
in software is like a bug; it will be more transparent in 
Open Source software and therefore fixed more quickly. 
Some organizations, such as the Alexis de Tocqueville Institution,
have suggested that Open Source programmers (or 
terrorists) could intentionally build back doors (areas of 
security weakness that the programmers could exploit) 
into the modules they were designing. However, now that 
Open Source software is beginning to exist in a corporate 
setting there is greater incentive to control the security of 
the whole product. 

Now let’s look at Open Source software as a business 
model. Open Source software is currently popular only 
among more technically-adept computer users. How 
well would it fare among those that are less tech-savvy? 
This problem is slowly being solved in the marketplace 
by people at KDE and GNOME, who build user-friendly 
interfaces and provide customer service for Open Source 
software. Furthermore, all end users will appreciate the 
value of software that has fewer bugs and is less likely 
to crash. Another legitimate question is how members 
of the Open Source community (who write and perfect 
the software without compensation) feel about other 
companies making money off of their product. Almost
everything sold by these companies is itself Open Source
software. They are thus abiding by the ethic of the 
community, which stresses access to the source code but 
does not prohibit making money. 

Finally, there is the potential of Open Source software 
production in the developing world. At some point, you 
have to go beyond operating systems and develop 
applications; for this task people may have to be paid. In 
the developing world, software export is driven by an army 
of paid programmers. How well will Open Source work in 
this environment? Companies in the developing world can 
build proprietary applications for Open Source operating 
systems; this would be an effective model in developing 
countries that prefer Open Source operating systems for 
economic or security reasons. 

Many developing countries now train students with 
Open Source software because it is affordable and it 
allows students to understand how software works on the 
inside. Once they understand an Open Source system, 
it may be argued that these students can easily build 
proprietary applications to run on top of it. 


PAUL AMMANN 


Paul Ammann lives in New Fairfield, CT with his wife and 4 cats. 
You can reach him at http://bsdday.eu/2012. 